Local versus non-local information in quantum information theory: formalism and phenomena






Michal Horodecki(1), Pawel Horodecki(2), Ryszard Horodecki(1), Jonathan Oppenheim(1)(3)(4),
Aditi Sen(De)(1)(5), Ujjwal Sen(1)(5) and Barbara Synak(1)
(1) Institute of Theoretical Physics and Astrophysics, University of Gdansk, 80-952 Gdansk, Poland
(2) Faculty of Applied Physics and Mathematics, Technical University of Gdansk, 80-952 Gdansk, Poland
(3) Dept. of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, U.K.
(4) Racah Institute of Theoretical Physics, Hebrew University of Jerusalem, Givat Ram, Jerusalem 91904, Israel and
(5) Institut für Theoretische Physik, Universität Hannover, D-30167 Hannover, Germany

(Dated: October 4th, 2004) 

In spite of many results in quantum information theory, the complex nature of compound systems 
is far from being clear. In general the information is a mixture of local, and non-local ("quantum") 
information. It is important from both pragmatic and theoretical points of view to know relationships
between the two components. To make this point more clear, we develop and investigate
the quantum information processing paradigm in which parties sharing a multipartite state distill 
local information. The amount of information which is lost because the parties must use a classical 
communication channel is the deficit. This scheme can be viewed as complementary to the notion 
of distilling entanglement. After reviewing the paradigm in detail, we show that the upper bound 
for the deficit is given by the relative entropy distance to so-called pseudo-classically correlated 
states; the lower bound is the relative entropy of entanglement. This implies, in particular, that 
any entangled state is informationally nonlocal i.e. has nonzero deficit. We also apply the paradigm 
to defining the thermodynamical cost of erasing entanglement. We show the cost is bounded from 
below by relative entropy of entanglement. We demonstrate the existence of several other non-local 
phenomena which can be found using the paradigm of local information. For example, we prove the 
existence of a form of non-locality without entanglement and with distinguishability. We analyze the 
deficit for several classes of multipartite pure states and obtain that in contrast to the GHZ state, 
the Aharonov state is extremely nonlocal (and in fact can be thought of as quasi-nonlocalisable).
We also show that there do not exist states, for which the deficit is strictly equal to the whole 
informational content (bound local information). We discuss the relation of the paradigm with measures
of classical correlations introduced earlier. It is also proven that in the one-way scenario, the
deficit is additive for Bell diagonal states. We then discuss complementary features of information 
in distributed quantum systems. Finally we discuss the physical and theoretical meaning of the 
results and pose many open questions.

I. INTRODUCTION



"Quantum information" is emerging as a primitive notion in physics following an essential extension of classical 
Shannon information theory [1] into the quantum domain. Quantum information cannot be defined precisely, but it
is necessary to understand the role of this mysterious and "unspeakable" information [2] in newly discovered quantum
phenomena such as teleportation [3] or cryptography [4, 5]. These phenomena strongly suggest that quantum states
represent quantum information - a reality that we process in the laboratory, but which cannot be described as a sequence
of classical symbols on a Turing tape [6, 7]. Recently the no-deleting and no-cloning theorems have been connected
with the principle of conservation of quantum information [8]. Like physical quantities such as energy, quantum
information has different forms, and one of them is entanglement - an exotic resource extraordinarily sensitive to the
environment. One finds a loss of entanglement in the transition from a pure entangled state to a noisy entangled state,
yet remarkably this process can be partially reversed within the distant labs paradigm. Namely, from a large number
of noisy bipartite states shared between two distant parties one can distill a number of e-bits at the optimal conversion
rate using local operations and classical communication (LOCC) [9].






Despite a plethora of measures which can be used to quantify entanglement, we are still far from properly
understanding it. Part of the difficulty is that measures of a quantity are not enough to understand the quantity - one
needs to understand entanglement in relation to something else. You cannot understand entanglement in relation to
entanglement. In the above context, basic questions arise: i) Does entanglement exhaust all aspects of quantum
information? ii) Are there resources other than entanglement in the distant labs paradigm? iii) Does quantum
information involve a nonlocality which goes beyond Bell's theorem?

The above questions have recently been considered in a series of papers. In particular, a new
quantum information processing paradigm has been introduced, in which we proposed the idea of attributing cost to
local resources such as pure local qubits. Instead of asking how much entanglement can be distilled from a
state shared between two parties, one can ask how many local pure qubits can be drawn from it. This gives a
quantity (called localisable information) which can then be used to get insight into the double nature of quantum
information. Namely, it was shown that local information can be thought of as being complementary to entanglement
[16], thereby allowing one, in particular, to understand entanglement in relation to local information.

At first glance, the idea of considering local pure states to be a resource may seem curious. In traditional
entanglement theory, one thinks of local pure states as being a free resource. Each party can use as many pure state ancillas
as desired. Furthermore, one can obtain pure states from a mixed state, simply by performing a measurement on the 
state. Note however, that the second law of thermodynamics tells us that purity is indeed a resource. One can never 
decrease the entropy of a closed system; entropy only increases. The reason a measurement appears to produce pure 
states is that we ignore the fact that the measuring apparatus must have initially been set in some pure state, and 
after the measurement, the apparatus will be in a mixture of all the different measurement outcomes. In other words, 
in a closed system which includes the state, the measurement apparatus and the observer, the total number of pure 
qubits can never increase. We must therefore be careful how we define the allowable class of operations in order to 
account for all pure states which might be introduced by various parties from the outside. We will discuss such a 
useful class, called Closed Operations which can properly be used to account for pure states. 

By considering pure states as a resource, one is immediately connecting quantum information theory with
thermodynamics. In fact, it was the early foundational work on reversible computation [21] where the entropic cost of
computation was considered [22]. This became one of the cornerstones which led to the possibility of quantum
computation. The relationship between information and physical tasks such as performing work also has a long history
beginning with Szilard [23]. In fact, as shown in [15, 24], the information function is exactly equal to the number of
pure qubits one can extract from a state while having many copies of the state. We will therefore talk of extracting
information $I$ from a state. One can think of this as extracting pure states from more mixed states. From the work
of Szilard, we also know that the information $I$ is closely related to the amount of work one can extract from a single
heat bath (see [25] for a rigorous derivation). These connections will be discussed in Section II, where we review the
basic concepts.

The rough essence of the approach is that if separated individuals extract local pure states (i.e. information) from a
shared state, using only local operations and classical communication, then they will in general be able to extract less
information than if they were together. If the amount of information they can extract when they are together from a
state $\varrho$ is $I(\varrho)$, and the optimal [26] amount they can extract when separated is $I_l(\varrho)$, then the
difference (called the deficit) $\Delta(\varrho) = I(\varrho) - I_l(\varrho)$ is sensitive to some non-classical
correlations in the state $\varrho$.

Note that the quantity $\Delta$ is not an entanglement measure, at least in the regime of finite copies of a state $\varrho$. It
is sensitive not only to entanglement, but also to so-called non-locality without entanglement [10]. We say that it quantifies the
quantumness of correlations rather than entanglement (first attempts to formally quantify such features have been made
both for quantum states and for ensembles). A state which has nonzero deficit we will call "informationally
nonlocal". The term nonlocality means here that distant parties can do worse than parties that are together, despite
the fact that they can communicate classically [10]. Thus it is a different notion than the nonlocality understood as
a violation of local realism (we have discussed the relations in [27]).

In this work we review some of the earlier results (see e.g. [24]) and provide more detail. We then give a number
of new, essential results within the paradigm of distillation of local information. In particular we provide a lower
bound for the deficit: it is bounded from below by the relative entropy of entanglement [28, 29]. We also find that
the CLOCC paradigm allows one to define the thermodynamical cost of erasure of entanglement. The cost is also bounded
from below by the relative entropy of entanglement. We also analyze the deficit for multi-party pure states such as
the Aharonov state, GHZ state and W state. We obtain that, according to the deficit, the Aharonov state exhibits
the greatest quantum correlations, while the GHZ state exhibits the least. In fact, the Aharonov state can be said to be
quasi-non-localisable, in the sense that in large dimension, the fraction of information which can be localised goes to
zero. We show that in the finite regime (i.e. where Alice and Bob deal with a single copy of a state), any entangled
state is informationally nonlocal, i.e. it has nonzero deficit. Moreover, we provide states which exhibit informational
nonlocality even though they are separable and have an eigenbasis of distinguishable states - call it non-locality without
entanglement but with distinguishability (a counterpart is known at the level of ensembles). We also provide
many other interesting results, including the impossibility of catalysis with local pure states, and the non-existence of
states whose entire informational content is non-localisable.

The paper is organized as follows. After the Introduction, in Section II an operational meaning of information is
briefly recalled in terms of transition rates and basic laws of thermodynamics. In Section III the idea of information
as a resource in the distant labs paradigm is presented. Here the central notion of the present formalism, i.e. the quantum
information deficit, is defined. In Section IV the various aspects of the information deficit and its dual notion, localisable
information, are discussed and an interpretation of the deficit in the context of quantum nonlocality is provided.
Section V presents the deficit as the entropy production needed to reach the set of pseudo-classically correlated states. The
concept is then generalized to an arbitrary set, including the set of separable states, and the cost of erasure of entanglement is
defined. Section VI provides upper and lower bounds for the deficit in terms of relative entropy distance and an upper bound
for the entanglement erasure cost.

We next turn to exploring new phenomena which can be discovered using our methods. In Section VII the main
implications of the results of the previous section are provided, including the key conclusion that any entangled state is
informationally nonlocal in a well defined, natural sense. We also prove the existence of separable states which have a
locally distinguishable eigenbasis, yet contain nonlocalisable information. Section VIII is devoted to the generalization
to the multipartite case. Some of these results were briefly noted in [14]. Here the information deficit is calculated and
the asymptotic behavior is analyzed for special examples of pure multipartite states: the GHZ state, the W state, and the
Aharonov state. We find that the Aharonov state can be considered to be the most non-local. Section IX contains
an exhaustive analysis of Bell states. In Section X we prove that (as opposed to pure nondistillable entanglement, i.e.
the bound entanglement phenomenon) pure unlocalisable information does not exist. Section XI includes an analysis
of the proportions of quantum and classical correlations in quantum states, addressing the question: can the first
component exceed the second? In Section XII the zero-way and one-way subclasses of the informational deficit are presented.
It is shown that in the asymptotic version the one-way deficit is nonzero for separable (disentangled) states, stressing that
quantum correlations are more than quantum entanglement. Section XIII discusses the relation of our measure to other
measures of quantumness of correlations, i.e. one-way and two-way quantum discord. Section XIV
contains a discussion of the results in the context of classical correlation measures introduced by other authors, including
the Henderson-Vedral measure. A discussion of complementarity between information quantities in distributed quantum
systems is provided in Section XV. The paper closes with a general discussion of the results and a list of open questions
in Section XVI.

II. INFORMATION: AN OPERATIONAL MEANING 

Before turning to the case of parties who are in distant labs, it will prove worthwhile to discuss the notion of 
information from a more general perspective. Although we often talk about information as an abstract concept, here, 
we use it as a term of art which refers to a specific function 

$$I(\varrho) = \log_2 d - S(\varrho) \qquad (1)$$

where $S(\varrho) = -\mathrm{tr}\,\varrho \log_2 \varrho$ is the von Neumann entropy of $\varrho$ acting on a Hilbert space of
dimension $d$. We will usually work with qubits, in which case $\log_2 d = N$ is an integer. As we will see in the next section,
the information function so defined has an operational meaning: it is the number of pure qubits one can draw from many
copies of the state.
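
As a small illustration of the information function (1), the following Python sketch evaluates it from the eigenvalues of a state. It relies only on the fact that the von Neumann entropy equals the Shannon entropy of the eigenvalue spectrum; the helper names (`entropy_bits`, `information`) and the example spectra are hypothetical choices made for this illustration.

```python
import math


def entropy_bits(spectrum):
    """Entropy in bits of a list of eigenvalues (zero eigenvalues contribute nothing)."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)


def information(spectrum):
    """I = log2(d) - S, where d is the number of eigenvalues supplied (illustrative helper)."""
    d = len(spectrum)
    return math.log2(d) - entropy_bits(spectrum)


# Example spectra (assumed for illustration): d = 2, 2 and 4 respectively.
pure_qubit = [1.0, 0.0]
maximally_mixed_qubit = [0.5, 0.5]
classically_correlated_pair = [0.5, 0.0, 0.0, 0.5]

for name, spectrum in [
    ("pure qubit", pure_qubit),
    ("maximally mixed qubit", maximally_mixed_qubit),
    ("classically correlated pair", classically_correlated_pair),
]:
    print(name, information(spectrum))
```

A pure qubit carries one unit of information, the maximally mixed qubit carries none, and the two-qubit classically correlated mixture carries one unit out of the two available.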

Let us now briefly discuss the information function (1) in the context of the more common Shannon picture. In the
latter approach a source produces a large amount of information if it has large entropy. Thus information can be
associated with entropy. This is because the receiver is being informed only if she is "surprised". In such an approach
the information has a subjective meaning: something which is known by the sender, but is not known by the receiver.
The receiver treats the message as information if she didn't know it. However, one can also consider an objective
picture: a system represents information if it is in a pure state (zero entropy). We know what state it is in. The state
is itself the information.

We obtain a dual picture, where two kinds of information are dual. Shannon's entropy represents the information
one can get to know about the system, while the information of (1) represents the information one knows about the
system. Together they add up to a constant, which characterizes the system only (not its particular state)

$$I_{total} = \log_2 d = S(\varrho) + I(\varrho) \qquad (2)$$

Note that the "objective" picture is more natural in the context of thermodynamics. There, a heat bath is highly
entropic, and we are ignorant of exactly what state it is in. On the other hand, it is known that using pure states,
one can draw work from a single heat bath using a Szilard heat engine [23]. The pure state represents the information
needed to order the energy of the heat bath. Knowing which side of a box the molecules of gas are in allows one
to draw work by having the molecules push out a piston. High entropy of the gas implies ignorance of the molecule
positions, and an inability to draw work from the system. In general, from a single heat bath of temperature $T$, by use
of a system in state $\varrho$, one can draw an amount of work

$$W = kTI \qquad (3)$$
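
To give a feeling for the scale of Eq. (3), the sketch below evaluates it numerically. It assumes that $I$ is counted in bits and that $k$ is Boltzmann's constant in SI units, in which case a factor $\ln 2$ converts bits to natural units; the text absorbs this factor into the choice of units. The function name and the temperature are hypothetical choices for this example.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K, exact SI value


def extractable_work(info_bits, temperature_kelvin):
    """Work in joules for `info_bits` of information at the given bath temperature.

    Assumption: I is measured in bits, hence the explicit ln(2) factor.
    """
    return K_BOLTZMANN * temperature_kelvin * math.log(2) * info_bits


# One pure qubit (I = 1) used as fuel against a room-temperature bath (assumed 300 K).
work_one_qubit = extractable_work(1.0, 300.0)
print(work_one_qubit)  # of order 1e-21 J
```

A maximally mixed state ($I = 0$) yields no work in this picture, consistent with its role as "noise" rather than fuel.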

The process does not violate the second law because the information is depleted as entropy from the heat bath
accumulates in the engine, and one cannot run a perpetuum mobile. Thus a quantum system in a nonmaximally mixed
state can be thought of as a type of fuel or resource. In fact, our original motivation for considering the function $I$
was to understand entanglement in a thermodynamical context.



A. Information and transition rates 

In [15, 24] it was shown that the function $I$ has an operational meaning in the asymptotic regime of many identical
copies. It gives the number of pure states that one can obtain from a state $\varrho$ under a certain class of operations we
call Noisy Operations (NO): operations that consist of (i) unitary transformations, (ii) partial trace and (iii) adding
ancillas in the maximally mixed state. First, one can show that it is the unique function (up to constants) that is not
increasing under the class NO. One then shows that $I$ determines the optimal rate of transitions between states under
NO. Let us now discuss two special cases.

First, given $n$ copies of a state $\varrho$ one can obtain $nI(\varrho)$ qubits in a pure state. This is done essentially by quantum data
compression [31]. In data compression, one keeps the signal and discards the qubits which are in the
pure state. Here we do the opposite. We discard the "signal", treating it as noise, and keep instead the redundancies
(that are in a pure state). Thus we obtain pure states. This is essentially like cooling [24]. The protocol does not
require using noisy ancillas (e.g. maximally mixed states).

A second protocol of interest takes $n(N - S(\varrho))$ pure qubits and produces $n$ copies of $\varrho$. The protocol,
described in [24], takes pure states and dilutes them using ancillas in the maximally mixed state (noise). The existence
of such dual protocols is similar to entanglement concentration and dilution. Similarly as in [36, 37], this
can be used to prove that there is a unique function that does not increase under the NO class of operations.

Note that for $K$ pure qubits, the information is equal to $K$. For the maximally mixed state $I = 0$. As mentioned, $I$ is
non-increasing under partial trace and under adding an ancilla in the maximally mixed state. It is of course constant
under unitary operations. The property that makes it a unique measure of information in the asymptotic regime
is asymptotic continuity (see [37], [38, 39]), which means that if two states are close to each other, then so are their
informations per qubit. It is important to remember that $I$ is not expansible, i.e. if we embed the state into a larger
Hilbert space, then it changes (because the number of qubits increases). The reason is obvious even within the classical
framework: if there are two possible states of the system, knowledge of the state represents less information than
knowledge of the state in the case of, say, three possible configurations. This is in contrast with entanglement theory,
where a pure state of Schmidt rank two always means the same thing, independently of how large the system is. Also the
entropy of a state depends only on nonzero eigenvalues: e.g. the entropy of a pure state is zero, independently of
how large the system is. However, in the present case, the Hilbert space and its dimension are an important element of our
considerations.
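
The two protocols and the non-expansibility of $I$ can be illustrated numerically. The sketch below is only bookkeeping of the asymptotic rates quoted above, not a simulation of the protocols themselves; the noisy qubit spectrum, the number of copies and all function names are assumptions made for the example.

```python
import math


def entropy_bits(spectrum):
    """Entropy in bits of an eigenvalue list; zero eigenvalues contribute nothing."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)


def information(spectrum):
    """I = log2(d) - S with d = len(spectrum)."""
    return math.log2(len(spectrum)) - entropy_bits(spectrum)


# Hypothetical noisy qubit (N = 1) and number of copies.
noisy_qubit = [0.9, 0.1]
n_copies = 1000

# Concentration: n copies -> about n * I pure qubits.
yield_pure = n_copies * information(noisy_qubit)
# Dilution: about n * (N - S) pure qubits -> n copies. Same number, as expected.
cost_pure = n_copies * (1 - entropy_bits(noisy_qubit))
print(yield_pure, cost_pure)

# Non-expansibility: the same pure state carries more information when it
# lives in a larger Hilbert space (d = 3 instead of d = 2).
info_in_qubit = information([1.0, 0.0])
info_in_qutrit = information([1.0, 0.0, 0.0])
print(info_in_qubit, info_in_qutrit)
```

The entropy of the pure state is zero in both embeddings; only the $\log_2 d$ term changes, which is exactly the dimension dependence stressed in the text.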



B. Information in context of "closed operations" 



In the previous section we have argued that the information function gives transition rates from mixed states to pure
ones and back, and that it is the unique measure of information in the context of NO. For the rest of this paper we will not
treat additional mixed states as a free resource. Thus let us now discuss the meaning of information in the context of
a class that is compatible with the class of operations which we will use in the case of distributed systems further in
this paper. Namely, we can consider closed operations (CO). They are arbitrary compositions of the following two
basic operations:

(i) unitary transformations 

(ii) dephasing $\varrho \to \sum_i P_i \varrho P_i$, where $\sum_i P_i = \mathbb{1}$ and the $P_i$ are projectors, not necessarily of rank one.

We call the class closed, though it is not actually fully closed. The information cannot go in, but can go out (via
dephasing). The name closed is motivated by the fact that the number of qubits is the same, and the qubits cannot be
exchanged between the system of interest and the environment. The only allowed contact with the environment is decoherence
caused by operation (ii). In the next section we will introduce a "closed" paradigm for the distant labs scenario, by use of
which we will define the quantum deficit.
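
The statement that information can leave but not enter under closed operations can be checked on a single qubit. The sketch below is an illustration under stated assumptions: dephasing is taken in the computational basis with rank-one projectors, the entropy uses the closed-form eigenvalues of a 2x2 density matrix, and the function names and test states are hypothetical.

```python
import math


def qubit_entropy(rho):
    """von Neumann entropy (bits) of a 2x2 density matrix with unit trace.

    Uses the closed-form eigenvalues (1 +/- r)/2, r = sqrt((a - d)^2 + 4|b|^2),
    where a, d are the diagonal entries and b the off-diagonal one.
    """
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    r = math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2)
    eigenvalues = [(1 + r) / 2, (1 - r) / 2]
    return -sum(x * math.log2(x) for x in eigenvalues if x > 1e-15)


def dephase(rho):
    """Operation (ii) with projectors |0><0| and |1><1|: drop the off-diagonal terms."""
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]


def information(rho):
    """I = N - S for a single qubit (N = 1)."""
    return 1.0 - qubit_entropy(rho)


plus_state = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|, pure but coherent in this basis
zero_state = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|, already diagonal

info_plus_before = information(plus_state)
info_plus_after = information(dephase(plus_state))
info_zero_after = information(dephase(zero_state))
print(info_plus_before, info_plus_after, info_zero_after)
```

Dephasing destroys the full unit of information in the coherent state, while a state that is already diagonal in the dephasing basis is left untouched; in neither case does the information grow.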

Now, let us ask about drawing pure qubits out of a given state by the present class of operations. The operations
do not change the size of the system, so that when we start e.g. with many copies of a state $\varrho$ we cannot end up with a
smaller system in an almost pure state. However this is not a big problem. Imagine for a while that in addition we can
apply partial trace (which is not allowed in CO). Then the process of drawing pure qubits can be divided into two
stages: 1) some CO operations aiming to concentrate the pure part into some number of qubits, and 2) partial trace
of the remaining qubits.

Since we do not allow for partial trace, one can simply stop before tracing out. The obtained state will have the form
of an (approximate) product of qubits in a pure state and the rest of the system - some garbage. Thus we can treat the
process of dividing the system into a pure part and garbage as extraction of pure qubits.

Now, let us ask how many pure qubits can be drawn from a state by closed operations in the above sense. Actually,
the process of drawing qubits by NO didn't use maximally mixed states. It was just a unitary operation, plus partial
trace. Thus we can apply this operation (unitaries are allowed in CO) and again get $I$ qubits per input copy. Thus
also within the "closed picture" information has the same interpretation: the maximal number of pure qubits that can be
obtained from a state, per input copy, by closed operations.

III. RESTRICTING THE CLASS OF OPERATIONS IN DISTANT LABS PARADIGM: CLOCC AND THE INFORMATION DEFICIT

In the preceding section, we discussed the notion of information from the perspective of being able to reversibly
distill pure states from a given state $\varrho$. Now, one can ask how things change when the allowable class of
operations one can perform is somehow restricted. This is a rather general question, but since here we are interested
in understanding entanglement and non-locality, we will examine the restricted class of operations which occurs when
various parties hold some joint state, but are in distant labs. One then imagines that Alice and Bob wish to distill
as many locally pure states as possible, i.e. product pure states such as
$|0\rangle_A^{\otimes m_A} \otimes |0\rangle_B^{\otimes m_B}$. The amount of local
information which is distillable we call the localisable information $I_l$.

In the ordinary approach to the distant labs paradigm, one imagines that two parties (Alice and Bob) are in distant 
labs and can only perform local operations and classical communication (LOCC). However, as we noted, this class of 
operations is not suitable to deal with the questions of concentration of information to local form. That is because 
under LOCC, one does not count the information that gets added to the systems through ancillas, measuring devices 
etc. We thus have to state the paradigm more precisely. Since we are interested in local information, we must treat 
it as a resource, assuming it cannot be created, but only manipulated. Once we have a compound state, the task is 
to localize the information by using a classical channel between Alice and Bob.

The new paradigm was introduced in earlier work, where one essentially looks at a closed system as one does in
thermodynamics when calculating changes in entropy. One imagines that Alice and Bob are in some closed box, and
one does not allow them to import additional quantum states, except for ones which we specifically keep track of and account
for.


In defining a class of operations, the crucial point is that here, unlike in usual LOCC (local operations and classical 
communication) schemes, one must explicitly account for all entropy transferred to measuring devices or ancillas. So in 
defining the class of allowable operations one must ensure that no information loss is being hidden when operations are 
being carried out. Moreover the operations should be general enough to represent faithfully the ultimate possibilities 
of Alice and Bob to concentrate information. In other words we wouldn't like to introduce any limitation apart from
two basic ones: (i) there is a classical channel between Alice and Bob, (ii) local information is a resource (cannot be
increased).

We consider a state $\varrho_{AB}$ acting on the Hilbert space $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$. Let us first
define the elementary allowable elements of Closed LOCC operations (CLOCC).

Definition 1. By CLOCC operations on a bipartite system of $n_{AB}$ qubits we mean all operations that can be composed
out of

(i) local unitary transformations 

(ii) sending subsystems down a completely decohering (dephasing) channel 
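The latter channel is of the form

$$\varrho_{in} \to \varrho_{out} = \sum_i P_i \varrho_{in} P_i \qquad (4)$$

where $P_i$ are one-dimensional projectors. For a qubit system, it acts as

$$\varrho_{in} = \begin{pmatrix} \varrho_{11} & \varrho_{12} \\ \varrho_{21} & \varrho_{22} \end{pmatrix} \to \varrho_{out} = \begin{pmatrix} \varrho_{11} & 0 \\ 0 & \varrho_{22} \end{pmatrix} \qquad (5)$$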



It is understood that $\varrho_{in}$ is at the sender's site, while $\varrho_{out}$ is at the receiver's site. The operation (ii) accounts for
both local measurements and sending the results down a classical channel. It can be disassembled into two parts: a)
local dephasing (at, say, the sender's site) and b) sending a qubit intact (through a noiseless quantum channel) to the receiver.
Thus suppose that Alice and Bob share a state $\varrho_{AB} \equiv \varrho_{A'A''B}$, and Alice decides to send subsystem $A''$ to Bob down
the dephasing channel. The following action will have the same effect. Alice dephases locally the subsystem $A''$

$$\varrho_{A'A''B} \to \sum_i (\mathbb{1}_{A'B} \otimes P_i^{A''})\, \varrho_{A'A''B}\, (\mathbb{1}_{A'B} \otimes P_i^{A''}) \qquad (6)$$

The state is now of the form

$$\varrho^{out}_{A'A''B} = \sum_i p_i\, P_i^{A''} \otimes \varrho^i_{A'B} \qquad (7)$$

with probabilities $p_i$ and states $\varrho^i_{A'B}$ of the remaining subsystems. Thus the part $A''$ is classically correlated
with the rest of the system (this is stronger than saying that the state is separable with respect to $A'' : A'B$). Now Alice
sends the system $A''$ to Bob through an ideal channel. Thus the final state differs from the state $\varrho^{out}_{A'A''B}$ only
in that the system $A''$ is at Bob's site. It follows that the operation can be
replaced by the following two operations

(iia) Local dephasing 

(iib) Sending completely dephased subsystem. 

Note that operations (i) and (iib) are reversible. Only the operation (iia) can, in general, be irreversible. In fact it
is irreversible whenever it changes the state, i.e. in all nontrivial cases. Note also that the operations do not change
the dimension of the total Hilbert space, or, equivalently, the number of qubits of the total system, even though the
particular qubits can be reallocated; for example at the end all qubits can be at Alice's site.

Let us finally note that it may happen that after the protocol, one of the parties will be left without any particles at
all, as all have been sent to the other party. It is only the total number of particles which is conserved.



A. Comparison with other class of operations 



For the purpose of the present paper, we will use solely CLOCC operations. Yet, since in some of our papers a 
different class of operations was employed, we will describe the other class and compare it with CLOCC. 

Let us first present the other (likely equivalent) class of operations, called Noisy LOCC (NLOCC). The relation 
between NLOCC and CLOCC will be similar to the relations between NO and CO: the elementary operations will be 
the same as in CLOCC, plus tracing out local systems and adding maximally mixed ancillas. 

Definition 2. By NLOCC operations on a bipartite system of $n_{AB}$ qubits we mean all operations that can be composed
out of

(i) local unitary transformations

(ii) sending a subsystem down a completely decohering (dephasing) channel

(iii) adding an ancilla in the maximally mixed state

(iv) discarding a local subsystem

As in CLOCC we can decompose (ii) into (iia) and (iib). CLOCC operations are more basic than NLOCC. Namely,
the latter can be treated as CLOCC with an additional resource: an unlimited supply of maximally mixed states (which
have zero informational content). Indeed, similarly as in Section II B, one can argue that the operation of local partial
trace is not essential.

IV. LOCALISABLE INFORMATION AND INFORMATION DEFICIT



In this section we define the central quantity: the information deficit. To this end we will first introduce the notion of localisable
information. We will first deal with the single copy case, and define basic quantities on this level. Then we will discuss
the asymptotic regime, which will require regularization of the quantities.

Definition 3. The localisable information $I_l(\varrho_{AB})$ of a state $\varrho_{AB}$ on the Hilbert space
$\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ is the maximal amount
of local information that can be obtained by CLOCC operations. More formally:

$$I_l(\varrho_{AB}) = \sup_{\Lambda \in CLOCC} \left( I(\varrho'_A) + I(\varrho'_B) \right) \qquad (8)$$

where $\varrho'_{AB} = \Lambda(\varrho_{AB})$, $I$ is the information function $I(\varrho'_X) = N'_X - S(\varrho'_X)$, and
$N'_A = \log_2 d'_A$, $N'_B = \log_2 d'_B$ are the numbers of qubits
of the subsystems of the output state. When one of the numbers of qubits is zero (null subsystem) we apply the convention
that its information is zero.



Alternatively, we have the formula

$$I_l(\varrho_{AB}) = N - \inf_{\Lambda \in CLOCC} \left( S(\varrho'_A) + S(\varrho'_B) \right) \qquad (9)$$

where $N$ is the total number of qubits. Again, if it happens that all particles are with one party (i.e. the other party's
output dimension is equal to one) so that the subsystem of the other party is null, then we apply the convention that the
entropy of such a subsystem is zero. In what follows, states of a system with one null subsystem will be called null-subsystem states.

It is important here that "to obtain local information" does not mean, as usual, getting some outcomes of local
measurements. Rather it means applying an operation after which the information, as a function of the states of the
subsystems, is maximal. Thus, we only deal with state changes, and calculate some function (the information function) on
states.

Actually, it is not localisable information which will be the most important quantity. Rather, the central quantity
is a closely connected one, which we call the quantum information deficit (in short, quantum deficit). It is defined as the
difference between the total information of the state and the information that can be localized by means of CLOCC
operations.

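Definition 4. The quantum deficit $\Delta(\varrho_{AB})$ of a state $\varrho_{AB}$ on the Hilbert space
$\mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ is given by

$$\Delta(\varrho_{AB}) = I(\varrho_{AB}) - I_l(\varrho_{AB}) \qquad (10)$$

Using the definition of localisable information $I_l$, we get an alternative formula for the quantum deficit

$$\Delta = \inf_{\Lambda \in CLOCC} \left( S(\varrho'_A) + S(\varrho'_B) \right) - S(\varrho_{AB}) \qquad (11)$$

where $\varrho'_{AB} = \Lambda(\varrho_{AB})$.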

It is important to notice that both quantities are functions not only of a state but also of the dimension of the Hilbert
space. This is because CLOCC operations are defined for a fixed Hilbert space. That $I_l$ depends on the dimension of the
Hilbert space is even more obvious, because the latter is explicitly written in the formula. However, in the formula
for the deficit as written in (11), the dimension does not appear explicitly, so it could happen that there is no dependence
on dimension. Actually, it is rather important that $\Delta$ does not depend on dimension, i.e. when one locally
increases the Hilbert space, e.g. by adding a qubit in a pure state, $\Delta$ should not change. This is because, as we will see later,
the deficit will be interpreted as a measure of quantumness of correlations, which should not change upon adding a local
ancilla. We will discuss this issue later in more detail. In particular we will show that the regularization of the
deficit does not change upon adding a local ancilla in a pure state.



A. Interpretation of quantum deficit: measure of "informational nonlocality" 

Nonzero deficit means that Alice and Bob are not able to localize all the information contained in the state. This means that part of the information is necessarily destroyed in the process of localizing by use of classical communication. This part of the information cannot survive traveling through classical channels, which implies that it must be somehow quantum. In addition, this part of the information must come from correlations, since information that is not in correlations is already local and need not be localized. We could say that the quantum deficit quantifies quantum correlations. However, we will see that the quantum deficit can be (and often is) nonzero for separable states, which can be generated by local quantum actions and solely classical communication. It is not clear, then, whether we can talk here about quantum correlations: can quantum correlations be created by only classical communication between the parties? Nevertheless, a nonzero quantum deficit indicates that there is something quantum in the correlations of the state. One can say that these are classical correlations of quantum properties. We therefore propose to interpret the quantum deficit as the amount of "quantumness of correlations".

Let us now discuss this issue in the context of the notion of nonlocality considered by |lf)j |. The authors exhibited ensembles of product states which are fully distinguishable if globally accessed, but cannot be perfectly distinguished by distant parties that can communicate only via a classical channel. They called this effect nonlocality without entanglement. The reason for using the term "nonlocality" was the following: one can do better if the system is accessible as a whole than when it is accessible only by local operations and classical communication.

In our case the situation is similar: Alice and Bob can do better in distilling local information if they have the two subsystems at the same place rather than shared between distant labs. Thus we have a similar kind of nonlocality, and the quantum deficit is a measure of such nonlocality, which we can call "informational", as it concerns a difference in access to informational content. Thus any state with nonzero deficit will be called informationally nonlocal (or nonlocal, when the context is obvious).

B. Classical information deficit of quantum states 

It is important to investigate not only the "quantumness" of compound quantum states, but also the relationships between their "classical" and "quantum" parts. To this end, consider the quantity $I_{LO}$, the information that is local from the very beginning, i.e.

$$ I_{LO} = N - S(\rho_A) - S(\rho_B) \tag{12} $$

We will call it local information. We can now define a quantity analogous to the quantum deficit, on a "lower" level.

Definition 5. The classical deficit of a quantum state is the difference between the information that can be obtained by CLOCC (i.e. the localisable information) and the local information:

$$ \Delta_c = I_l - I_{LO} \tag{13} $$

This tells us how much more information can be obtained from the state by exploiting additional correlations in the state $\rho_{AB}$. We will refer to $\Delta_c$ as the classical deficit, because the channel is classical. Also, as we will see later, the quantity can be used in the context of quantifying classical correlations (though this is not immediate, see [40]).
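As a small illustration of Eqs. (12) and (13), the following sketch evaluates $I$, $I_{LO}$ and $\Delta_c$ for a classically correlated state, i.e. a joint probability distribution embedded in a quantum state. It assumes that for such states the full information can be localised ($I_l = I$, zero quantum deficit), so that $\Delta_c$ reduces to the classical mutual information. The function names are hypothetical and chosen only for this example.

```python
from math import log2

def shannon(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def informations_classical(p):
    """p[i][j] is the joint distribution of the classically correlated state
    sum_ij p_ij |i><i| (x) |j><j|. Returns (I, I_LO, Delta_c) in bits.
    Assumption: I_l = I for such states (zero quantum deficit)."""
    d_a, d_b = len(p), len(p[0])
    n = log2(d_a) + log2(d_b)                      # total number of qubits N
    s_ab = shannon([x for row in p for x in row])  # S(rho_AB)
    s_a = shannon([sum(row) for row in p])         # S(rho_A)
    s_b = shannon([sum(col) for col in zip(*p)])   # S(rho_B)
    total = n - s_ab                               # I = N - S(rho_AB)
    local = n - s_a - s_b                          # I_LO, Eq. (12)
    return total, local, total - local             # Delta_c = I_l - I_LO, Eq. (13)

# Perfectly correlated bits: (|00><00| + |11><11|)/2
print(informations_classical([[0.5, 0.0], [0.0, 0.5]]))
```

For the perfectly correlated pair of bits this gives $I = 1$, $I_{LO} = 0$ and $\Delta_c = 1$: all the information sits in the correlations. For a product distribution the classical deficit vanishes.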

C. Restricting resources: zero-way and one-way subclasses 

Additional measures of quantumness of correlations arise when one restricts the communication between Alice and Bob.

One can define the one-way (Alice to Bob) deficit $\Delta^{\rightarrow}$ and the one-way (Bob to Alice) deficit $\Delta^{\leftarrow}$ by restricting the classical communication to be in one direction only. Furthermore, one also has a zero-way deficit $\Delta^{\emptyset}$. The name "zero-way" is perhaps confusing. It refers to the situation where no communication is allowed between Alice and Bob until after they have completely dephased (or performed measurements on) their systems. After they have done this, they may communicate in order to exploit the (now purely classical) correlations to localize the information. These restricted deficits correspond to the locally accessible informations $I_l^{\rightarrow}$, $I_l^{\leftarrow}$ and $I_l^{\emptyset}$.

D. Asymptotic regime: Distillation of local information as dual picture to entanglement distillation 

In this section we argue that the idea of localization of information, though at first glance exotic, can be recast in terms typical for quantum information theory, where manipulations over resources are of central importance. Moreover, our present formulation will be analogous to the scheme that is a basis for entanglement theory: entanglement distillation. We will use the interpretation of the information function as the amount of pure qubits one can draw from a state in the limit of many copies.

Instead of singlets, our precious resource will be pure local qubits. The aim of Alice and Bob is, given many copies of the state $\varrho_{AB}$, to distill the maximal amount of local pure qubits by means of CLOCC operations. (In entanglement theory we had LOCC operations; here, however, we need CLOCC, since otherwise one could add local ancillas for free, and the maximal distillable amount of pure local qubits would be infinite.) One way of doing that is the following. Alice and Bob take the state $\varrho_{AB}$ and apply the CLOCC protocol that optimizes the formula for localisable information, i.e. they obtain a state $\varrho'_{AB}$ which has maximal local informations $I'_A$ and $I'_B$. They apply this protocol to every copy of the state they share, and as a result obtain many copies of the state $\varrho'_{AB}$. Now Alice, in her lab, can apply the protocol of drawing pure qubits out of her state $\varrho'^{\otimes n}_A$, obtaining $I'_A$ pure qubits per copy. Bob does the same. Finally, they possess $I'_A + I'_B$ pure local qubits per copy, which is just equal to the localisable information. This is actually the best they can do when they first act on single copies using communication, and only locally perform collective actions on many copies.

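Alice and Bob could do better if they act collectively from the very beginning. In this way we get that the optimal amount of local pure qubits that can be distilled by CLOCC is equal to the regularization of the localisable information:

$$ I_l^{\infty}(\varrho) = \lim_{n\to\infty} \frac{I_l(\varrho^{\otimes n})}{n} \tag{14} $$

Similarly, we can define the regularized quantum and classical deficits:

$$ \Delta^{\infty}(\varrho) = \lim_{n\to\infty} \frac{\Delta(\varrho^{\otimes n})}{n}, \qquad \Delta_c^{\infty}(\varrho) = \lim_{n\to\infty} \frac{\Delta_c(\varrho^{\otimes n})}{n} \tag{15} $$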

Thus we conclude that the regularizations of our quantities have an operational meaning connected with the amount of pure local qubits which can be distilled out of a large number of copies of the input state by means of different resources (global operations, CLOCC, local operations). Let us emphasize that when Alice and Bob are given a single copy of the state, they usually cannot distill pure qubits. When they are given many copies, the ultimate amount of distillable pure qubits is described by the regularized $I_l$. Thus the non-regularized quantity does not represent the amount of pure qubits that can be drawn either from a single copy or from many copies. However, since the definition of $I_l$ contains the information function, which has an operational asymptotic meaning, $I_l$ also has an asymptotic interpretation: it represents the amount of pure local qubits that can be drawn when, at the stage of communication, Alice and Bob operate on single copies, and only after that stage operate collectively.

In entanglement theory there is a similar situation with the entanglement of formation and the entanglement cost. The first is not the ultimate cost of producing a state out of singlets, though its definition already contains "some asymptotics": the von Neumann entropy, which is the asymptotic cost of producing pure states out of singlets. The ultimate cost of producing states out of singlets is the regularization of the entanglement of formation.

There is, however, a difference. Namely, even $I_l$ itself, without regularization, has an operational meaning. Indeed, it is proportional to the amount of work one can draw from a single copy of the state in the presence of local heat baths. To draw the optimal number of pure qubits, one needs many copies; in the one-copy case, drawing pure qubits is highly non-optimal. However, to draw the optimal amount of work by use of the state and a single heat bath, one copy is enough. Roughly speaking, in the former case the law of large numbers (in other words, ergodicity) comes from the many copies, while in the latter ergodicity is "supplied" by the heat bath.

Finally, one can also consider the amount of local information that can be distilled by means of one-way classical communication. The corresponding quantity is the regularized one-way quantum deficit $\Delta^{\rightarrow,\infty}$. In a similar vein we can consider regularizations of other quantities based on restricted resources, such as $\Delta^{\leftarrow}$, $\Delta^{\emptyset}$, etc. Again, all those regularizations have an operational meaning.

E. Additional local resources 

One of the basic features of the paradigm is that adding local ancillas is not free. The reason is that otherwise all the quantities would become trivial. However, there are two kinds of local resources that can still be taken into account.

First of all, we can allow adding, for free, local ancillas in the maximally mixed state. Thus, given a state $\varrho_{AB}$, we can ask about the quantities of interest for the state $\varrho_{AB} \otimes \frac{I}{d}$. If the quantities do not change, this would mean that it makes no difference whether we use the NLOCC class instead of CLOCC. Indeed, as we have already mentioned, the only difference between the two classes for the problem of distillation of local information may appear when adding local maximal noise could help. In general, upon adding such local noise, the localisable information could only go up. However, it is more likely that it will not change. In fact, Devetak has shown [20] that the one-way deficit does not change upon adding noise. We were not able to show the same in the case of two-way communication, though we believe it is also the case.

The second possibility is borrowing local pure qubits. This would be most welcome, as it would mean that the deficit does not depend on the dimension of the Hilbert space, as discussed in the introduction of Section IV. We actually show that this is the case for the regularized deficit in Section X. For the one-way case it is also shown in the asymptotic regime in [20]. There is a more general possibility: borrowing a local ancilla in any mixed state. However, in the asymptotic limit this is actually equivalent to borrowing noise and pure qubits, as in that regime any state can be reversibly composed out of noise and pure qubits 0, l24|.

F. An example: pure states 

As we have mentioned, in our definition of quantum correlations we do not speak about entanglement at all. We do not work in the established paradigm of the optimal rate of transformation to or from maximally entangled states [41]. We consider distillation of pure product states. Thus it was perhaps surprising to find [bH ] that for pure states this definition of quantumness of correlations is just equal to the unique asymptotic entanglement measure for pure states |3fll4lj.

We shall now see this by taking as an example the singlet

$$ |\psi\rangle = \frac{1}{\sqrt{2}} \left( |00\rangle - |11\rangle \right). \tag{16} $$

It is a two-qubit state of zero entropy, so its informational content, given by the information function $I = N - S$, is $I = 2$. We will now see that $I_l = 1$. Clearly, without communicating, neither party can draw any information from the state, since locally the state is maximally mixed. It turns out that the best protocol is for Alice to send her qubit down the dephasing channel. After she has done this, Bob will hold the classically correlated state

$$ \rho_{cc} = \frac{1}{2} \left( |00\rangle\langle 00| + |11\rangle\langle 11| \right) \tag{17} $$

from which one can extract 1 bit of information by performing a CNOT gate to extract one pure state $|0\rangle$. We thus have $\Delta = 1$. One can actually view this process in terms of measurements and classical communication, as long as we keep track of the measuring device. Alice performs a measurement on the state to find out whether she has $|0\rangle$ or $|1\rangle$. She then tells Bob the result. Bob now holds a known state, without having to perform any measurement. Alice, on the other hand, had to perform a measurement to learn her state. The informational cost of the measurement is 1 bit, since the measuring apparatus is initially in a pure state and must have two possible outcomes. After the measurement, the measuring device needs to be reset. The classically correlated state $\rho_{cc}$, if held between two parties, has $\Delta = 0$. That the process is optimal for the singlet state is obvious, as this is actually the only thing which Alice and Bob can do given a single copy. However, it is highly nontrivial to show that the regularization of $I_l$ is still the same. The optimality of this protocol in the many-copy case as well was shown in [15]. It also follows from the general theorem we give in this paper, which connects the deficit with the relative entropy distance from some set of states.

In general, it is not hard to see that for an arbitrary pure state the same protocol can be used, with Alice first performing local compression on her state. For any pure state $|\psi\rangle_{AB}$, the two-way $\Delta$ is given by [14, 15]

$$ \Delta(|\psi\rangle) = S\left( \mathrm{tr}_A (|\psi\rangle\langle\psi|) \right). $$

Thus for pure bipartite states the quantum deficit is equal to the entanglement. It is quite interesting that we have obtained entanglement by destroying entanglement.
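The pure-state formula can be evaluated directly from the Schmidt coefficients of the state, since the eigenvalues of the reduced state are their squares. The sketch below does this for the state of Eq. (16); the function name and the choice to pass Schmidt coefficients as a list are assumptions made for illustration only.

```python
from math import log2, sqrt

def pure_state_deficit(schmidt):
    """Delta for a bipartite pure state with the given Schmidt coefficients:
    the entropy of the reduced state, whose eigenvalues are the squares."""
    probs = [c * c for c in schmidt]
    assert abs(sum(probs) - 1.0) < 1e-9, "state must be normalised"
    return -sum(p * log2(p) for p in probs if p > 0)

n_qubits = 2
delta = pure_state_deficit([1 / sqrt(2), 1 / sqrt(2)])  # state of Eq. (16)
total_info = n_qubits - 0.0        # I = N - S, with S = 0 for a pure state
localisable = total_info - delta   # I_l = I - Delta
print(delta, localisable)
```

Up to floating-point rounding this reproduces $\Delta = 1$ and $I_l = 1$ for the singlet, while a product state (a single nonzero Schmidt coefficient) has zero deficit.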

V. DEFICIT AS PRODUCTION OF ENTROPY NECESSARY TO REACH PSEUDO-CLASSICALLY CORRELATED STATES

In this section we will show that the quantum deficit can be interpreted as the amount of entropy one has to produce in the process of transforming a given state into a so-called pseudo-classically correlated state. This expression for the deficit makes it possible to define the entropy production connected with a given subset of states. For example, we can then speak about the entropy production needed to reach the set of separable states. In this way our paradigm provides a consistent definition of the thermodynamical cost of erasure of entanglement, while the original deficit can be called the thermodynamical cost of erasing quantum correlations.

A. Important classes of states 

Let us first define the sets of states which are important for our analysis. Notice that in place of a simple dichotomy between separable and entangled states, one can have a whole hierarchy of levels of quantumness [53]. Already Werner recognized [43] that within entangled states there might be ones that do not violate Bell's inequalities (cf. [44, 45]). One may also go in the converse direction, and within separable states find a subclass which is most classical, and wider classes which are still somehow classical, though in some sense to a lesser degree (cf. [TUp).

First, let us consider a set of states which we choose to call properly classically correlated, or shortly classically correlated. These are states of the form

$$ \varrho = \sum_{ij} p_{ij}\, |i\rangle\langle i| \otimes |j\rangle\langle j| \tag{18} $$

where $\{|i\rangle\}$ and $\{|j\rangle\}$ are local bases. Thus any such state is a classical joint probability distribution naturally embedded into a quantum state. Note that the set of classically correlated states is invariant under local unitary operations. The states are diagonal in a special product basis, which can be called a biproduct basis.

Now let us define the set of states of our central interest. We will call them pseudo-classically correlated states and denote the set by PC. These are the states that can be reversibly transformed into classically correlated ones by CLOCC. "Reversibly" means that no entropy is produced during the protocol. This implies that no dephasing is needed in the transformations: Alice and Bob use only unitaries and send only subsystems for which dephasing does not change the total state. Thus they can send only subsystems $X$ that are in the following state with the rest $R$: $\rho_{XR} = \sum_i p_i\, |i\rangle_X\langle i| \otimes \rho^i_R$. The states that can be transformed in this way into classical ones can also be described as the set of states which Alice and Bob can create under the allowed class of operations (CLOCC) out of classical states. The eigenbasis of these states was called an Implementable Product Basis, or IPB, in [15], since it is an eigenbasis that Alice and Bob are able to dephase in.

Let us note that one can have an intermediate class, the one-way classically correlated states, which are of the form

$$ \varrho = \sum_i p_i\, |i\rangle\langle i| \otimes \varrho_i \tag{19} $$

These are states which can be produced out of classically correlated states by one-way reversible CLOCC. They are diagonal in a basis of the form $\{|i\rangle |\psi^{(i)}_k\rangle\}$, where for each $i$ the set $\{|\psi^{(i)}_k\rangle\}_k$ is itself a basis.

The above sets are proper subsets of separable states, and all the inclusions between them are proper too. 



B. Formula for quantum deficit in terms of pseudo-classically correlated states 

Any protocol attaining the information deficit looks as follows. Alice chooses a subsystem of her system, dephases it, and sends it to Bob. Bob then chooses a subsystem of his system (which now includes his original system and the system sent by Alice). He dephases his chosen part and sends it to Alice. They can send the states using an ideal channel, as the sent subsystems are already dephased. Thus sending is here only reallocating subsystems, nothing more. Alice and Bob continue this process as long as they wish. When they decide to stop, the final state is $\rho'$ and the obtained local information is equal to $N - S(\rho'_A) - S(\rho'_B)$, while the initial total information was $I = N - S(\rho_{AB})$. Thus the deficit obtained in a particular protocol $P$ is $\Delta_P = S(\rho'_A) + S(\rho'_B) - S(\rho_{AB})$. Alice and Bob wish this quantity to be minimal. Suppose then that they performed an optimal protocol, for which this value is indeed minimal.

There are two cases: (i) one of the subsystems is null (all particles are with the other party), or (ii) both parties have subsystems that are not null. Note that in the second case the system must be in a product state. Suppose it is not. Then Alice and Bob can dephase the state in the eigenbases of the states of the local subsystems. This will not change the local entropies, but will transform the state into a classically correlated one. Then Alice can send her part to Bob, so that the information content of the total state will be unchanged. However, if the state was non-product, the total information was greater than the sum of the local informations. This means that the protocol was not optimal, so we have a contradiction.

Thus we conclude that the optimal protocol ends up with either a product state or a state of a system one of whose subsystems is null (all particles either with Bob or with Alice). Moreover, when the state is product, one of the subsystems can be sent to the other party, so that the whole system is with one party. This is compatible with the philosophy of "localizing" information.

However it turns out that we can divide the total process of localizing of information into two stages: 

• Irreversible stage: transforming the input state $\varrho$ into some pseudo-classically correlated state $\varrho'$.

• Reversible stage: localizing the information of the state $\varrho'$.

In the first stage Alice and Bob try to produce the least entropy. The amount of information that they are able to localize is determined by this stage. In the second stage no entropy is produced, and the information is constant. We have the following proposition.

FIG. 1: The CLOCC protocol of concentration of information to local form is a series of actions aiming to reach the set of pseudo-classically correlated states. The continuous lines denote reversible actions: sending dephased qubits or local unitary transformations. The dotted lines denote dephasings. The goal is to make the total entropy increase $\Delta S = \Delta S_1 + \Delta S_2 + \cdots$ minimal. Then the deficit is given by $\Delta = \Delta S$, because once the state is pseudo-classically correlated, its full information content can be localized.



Proposition 1. The quantum deficit is of the form

$$ \Delta = \inf_{P} \left( S(\rho') - S(\rho) \right) \tag{20} $$

where the infimum is taken over all CLOCC protocols $P$ that transform the initial state $\rho$ into a pseudo-classically correlated state $\rho'$.

Proof. The proof reduces to noting that pseudo-classically correlated states can be reversibly created from states with one null subsystem. Indeed, by definition pseudo-classically correlated states can be reversibly produced out of classically correlated states. The latter, in turn, can be reversibly produced out of one-subsystem states. Thus, consider an optimal protocol for drawing local information. As we have argued, it can end up with a one-subsystem state. Out of this state we can reversibly create a classically correlated state, which is a special case of a pseudo-classically correlated state. Conversely, suppose that we have a protocol that ends up with a pseudo-classically correlated state. Then one can reversibly transform it into a one-subsystem state. □

Thus the quantum deficit is equal to the minimum entropy production needed to make the state pseudo-classically correlated by CLOCC operations. In other words, to draw the optimal amount of local information from a given state, one should try to make it pseudo-classically correlated in the most gentle way, i.e. producing the least possible amount of entropy. Once the state is pseudo-classically correlated, the further process of localization of information is trivial. The first stage is illustrated in Figure 1.



C. Defining cost of erasing entanglement 

The above formulation of the deficit allows one to generalize the idea of thermodynamical cost to other situations. Namely, instead of the set of pseudo-classically correlated states one can take any other set and ask the same question: how much entropy must be produced while reaching this set by use of CLOCC? Thus our concept of localizing information allows us to ascribe thermodynamical costs to tasks other than localizing information. With any chosen set we can associate a suitable deficit $\Delta_{set}$. An important application of this concept is to take the set of separable states. The associated deficit $\Delta_{sep}$ then has the interpretation of the thermodynamical cost of erasing entanglement. As such it is a good candidate for an entanglement measure. In this paper we will show that it is bounded from below by the relative entropy of entanglement. Since the set of separable states is a superset of the pseudo-classically correlated states, we have

$$ \Delta_{sep} \le \Delta \tag{21} $$

so that the cost of erasing all quantum correlations is no smaller than the cost of erasing entanglement. For the sake of further proofs, let us give here the formal definition of $\Delta_{sep}$.

Definition 6. The thermodynamical cost of erasing entanglement $\Delta_{sep}$ is given by

$$ \Delta_{sep}(\rho) = \inf_{P} \left( S(\rho') - S(\rho) \right) \tag{22} $$

where the infimum runs over all CLOCC protocols $P$ which transform the initial state $\rho$ into a separable output state $\rho'$.

VI. RELATIONS BETWEEN DEFICIT AND RELATIVE ENTROPY DISTANCE

In this section we present the proof of the theorem, obtained in [15], that relates the deficit to the relative entropy distance.

Theorem 1. The information deficit is bounded from above by the relative entropy distance from the set of pseudo-classically correlated states:

$$ \Delta(\varrho_{AB}) \le \inf_{\sigma \in PC} S(\varrho_{AB}|\sigma) \equiv E_r^{PC} \tag{23} $$

where $S(\varrho|\sigma) = \mathrm{tr}\,\varrho\log\varrho - \mathrm{tr}\,\varrho\log\sigma$.

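Let us first prove the following proposition.

Proposition 2. The localisable information and the deficit satisfy the following bounds:

$$ I_l(\varrho) \ge N - \inf_{B \in IPB} H(\varrho, B) \tag{24} $$

$$ \Delta(\varrho) \le \inf_{B \in IPB} H(\varrho, B) - S(\varrho) \tag{25} $$

where $H(\varrho, B)$ denotes the entropy of the diagonal entries of the state $\varrho$ in the basis $B$,

$$ H(\varrho, B) = -\sum_i p_i \log p_i, \tag{26} $$

with $p_i = \langle\psi_i|\varrho|\psi_i\rangle$ and $|\psi_i\rangle \in B$.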

Proof. We will exhibit a simple protocol that achieves a reasonable amount of local information. Namely, Alice and Bob choose some implementable basis $B$ and dephase the state in this basis. They can do this since, by definition, an IPB is a basis in which Alice and Bob can dephase by use of CLOCC. The final state has entropy

$$ S(\varrho') = H(\varrho, B). \tag{27} $$

Alice and Bob can now choose the basis that produces the smallest possible entropy $H(\varrho, B)$. In this way we obtain the following bound for $\Delta$:

$$ \Delta \le \inf_{B \in IPB} H(\varrho, B) - S(\varrho) \tag{28} $$

This ends the proof of the proposition. □
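To see how the bound of Proposition 2 depends on the chosen basis, the sketch below evaluates $H(\varrho, B)$ for the pure state of Eq. (16) in a family of product bases. Only real rotations of the two local bases are scanned, which is a restricted subset of all implementable product bases chosen here for simplicity; the function names and the grid are illustrative assumptions. Since the state is pure, $S(\varrho) = 0$ and the bound reads $\Delta \le H(\varrho, B)$.

```python
from math import cos, sin, log2, sqrt, pi

def rotated_basis(theta):
    """Real orthonormal qubit basis obtained by rotating |0>, |1> by theta."""
    return [(cos(theta), sin(theta)), (-sin(theta), cos(theta))]

def diagonal_entropy(psi, theta_a, theta_b):
    """H(rho, B) for the pure two-qubit state psi[a][b] and the product
    basis B = {|a_i>|b_j>} built from two rotated local bases."""
    probs = []
    for u in rotated_basis(theta_a):
        for v in rotated_basis(theta_b):
            amp = sum(u[a] * v[b] * psi[a][b] for a in range(2) for b in range(2))
            probs.append(amp * amp)
    # Tiny probabilities arising from rounding are skipped.
    return -sum(p * log2(p) for p in probs if p > 1e-15)

psi = [[1 / sqrt(2), 0.0], [0.0, -1 / sqrt(2)]]   # (|00> - |11>)/sqrt(2)

grid = [k * pi / 16 for k in range(16)]
best = min(diagonal_entropy(psi, ta, tb) for ta in grid for tb in grid)
print(diagonal_entropy(psi, 0.0, 0.0), diagonal_entropy(psi, 0.0, pi / 4), best)
```

In the computational basis the diagonal entropy is 1 bit, while a mismatched pair of local bases (Alice unrotated, Bob rotated by $\pi/4$) gives 2 bits. The smallest value found on the grid is 1, consistent with $\Delta = 1$ for this state.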

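Let us now express this bound in terms of the relative entropy distance. This is done by the following lemma.

Lemma 1. Given a state $\varrho$,

$$ H(\varrho, B) = \inf_{\sigma \in \mathcal{S}_B} S(\varrho|\sigma) + S(\varrho), \tag{29} $$

where $H(\varrho, B)$ is the Shannon entropy of the probability distribution of the outcomes when $\varrho$ is measured in a given basis $B$, and $\mathcal{S}_B$ is the set of all states with eigenbasis $B$.

Proof. We have

$$ \inf_{\sigma \in \mathcal{S}_B} S(\varrho|\sigma) + S(\varrho) \tag{30} $$
$$ = \inf_{\sigma \in \mathcal{S}_B} \left( -\mathrm{tr}(\varrho \log_2 \sigma) \right) $$
$$ = -\mathrm{tr}(\varrho_B \log_2 \varrho_B) + \mathrm{tr}(\varrho_B \log_2 \varrho_B) + \inf_{\sigma \in \mathcal{S}_B} \left( -\mathrm{tr}(\varrho_B \log_2 \sigma) \right) $$
$$ = S(\varrho_B) + \inf_{\sigma \in \mathcal{S}_B} S(\varrho_B|\sigma) $$
$$ = H(\varrho, B). $$

Here $\varrho_B$ is the state $\varrho$ dephased in the basis $B$. In the second equality we have used the fact that $\mathrm{tr}(\varrho \log_2 \sigma) = \mathrm{tr}(\varrho_B \log_2 \sigma)$, because $\sigma$ is diagonal in the basis $B$. In the fourth equality we have used that $\varrho_B$ belongs to the set $\mathcal{S}_B$, so that $\inf_{\sigma \in \mathcal{S}_B} S(\varrho_B|\sigma) = 0$, and also that $S(\varrho_B) = H(\varrho, B)$. This ends the proof of the lemma. □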
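Lemma 1 can be checked numerically in the simplest case of a single qubit with $B$ the computational basis. The sketch below uses a hypothetical real qubit state $\varrho = \begin{pmatrix} a & c \\ c & 1-a \end{pmatrix}$ and approximates the infimum over diagonal states $\sigma = \mathrm{diag}(s, 1-s)$ by a grid search; the example values of $a$ and $c$ and the grid resolution are assumptions made for illustration.

```python
from math import log2, sqrt

def h2(p):
    """Binary Shannon entropy in bits."""
    return -sum(x * log2(x) for x in (p, 1 - p) if x > 0)

def qubit_entropy(a, c):
    """von Neumann entropy of the real qubit state [[a, c], [c, 1 - a]]."""
    r = sqrt((2 * a - 1) ** 2 + 4 * c * c)   # Bloch vector length
    return h2((1 + r) / 2)

def rel_entropy_to_diagonal(a, c, s):
    """S(rho|sigma) = tr rho log rho - tr rho log sigma, sigma = diag(s, 1 - s)."""
    return -qubit_entropy(a, c) - (a * log2(s) + (1 - a) * log2(1 - s))

a, c = 0.7, 0.3                        # hypothetical example state
lhs = h2(a)                            # H(rho, B): entropy of the diagonal
best = min(rel_entropy_to_diagonal(a, c, k / 1000) for k in range(1, 1000))
rhs = best + qubit_entropy(a, c)       # infimum over sigma, taken on a grid
print(lhs, rhs)
```

The grid contains the dephased state $\varrho_B = \mathrm{diag}(0.7, 0.3)$, which is where the minimum is attained, so the two printed numbers agree up to rounding.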

Now, combining the lemma with the proposition, we obtain the above theorem. We have not been able to prove equality, and in Section VI C we discuss the origin of the difficulties.

A. Deficit, cost of erasure of entanglement and relative entropy of entanglement

In the previous section we reproduced the result of [15], which provided an upper bound for the deficit in terms of the relative entropy distance from pseudo-classically correlated states. In this section we prove a new result, providing a lower bound for the deficit in terms of an entanglement measure: the relative entropy of entanglement.

Theorem 2. For any bipartite state $\rho$ the quantum deficit is bounded from below by the relative entropy of entanglement:

$$ \Delta(\rho) \ge E_r(\rho) \tag{31} $$

To prove the above theorem it is enough to show that $\Delta_{sep}$, the cost of erasing entanglement, is lower bounded by $E_r$, which is the content of the next theorem. Indeed, by the definition of $\Delta_{sep}$ and by Proposition 1, the deficit is no smaller than $\Delta_{sep}$.

Theorem 3. For any bipartite state $\rho$ the thermodynamical cost of erasing entanglement is bounded from below by the relative entropy of entanglement:

$$ \Delta_{sep}(\rho) \ge E_r(\rho) \tag{32} $$

To prove this theorem we will need the following lemma.

Lemma 2. Consider any subset $S$ of states that is invariant under product unitary transformations. Then the relative entropy distance from this set, given by

$$ E_r^S(\rho) = \inf_{\sigma \in S} S(\rho|\sigma), \tag{33} $$

decreases no more than the entropy increases under local dephasing, that is

$$ E_r^S(\rho) - E_r^S(\Lambda(\rho)) \le S(\Lambda(\rho)) - S(\rho) \tag{34} $$

where $\Lambda$ is a local dephasing.

Proof. Note first that local dephasing can be represented as a mixture of local unitaries:

$$ \Lambda(\rho) = \sum_i p_i\, U^i_A \otimes I_B\; \rho\; U^{i\dagger}_A \otimes I_B \tag{35} $$

Indeed, consider any set of projectors $\{P_j\}_{j=1}^k$. The suitable unitaries are given by

$$ U(s_1, \ldots, s_k) = \sum_{j=1}^{k} s_j P_j \tag{36} $$

where the $s_j = \pm 1$ are chosen at random. Thus the $p_i$'s are equal, but this is irrelevant for our purpose.
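The decomposition of Eqs. (35) and (36) can be verified on a small example. The sketch below dephases Alice's qubit of the two-qubit state of Eq. (16) in two ways: directly, as $\sum_j P_j \rho P_j$, and as a uniform mixture over all sign unitaries $U(s_1, s_2)$. The helper names and the plain list-of-lists matrix representation are choices made for this illustration only.

```python
from itertools import product

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled_sum(terms):
    """Sum of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

# Projectors P_j (x) I_B on Alice's qubit, basis order |00>, |01>, |10>, |11>.
P = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],   # |0><0| (x) I
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],   # |1><1| (x) I
]

# rho = |psi><psi| for psi = (|00> - |11>)/sqrt(2)
rho = [[0.5, 0, 0, -0.5], [0, 0, 0, 0], [0, 0, 0, 0], [-0.5, 0, 0, 0.5]]

# Direct dephasing: sum_j P_j rho P_j
direct = scaled_sum([(1, matmul(matmul(p, rho), p)) for p in P])

# Uniform mixture over U(s) = sum_j s_j P_j; U is real and diagonal, so U^dagger = U.
signs = list(product([1, -1], repeat=len(P)))
mixture_terms = []
for s in signs:
    u = scaled_sum(list(zip(s, P)))
    mixture_terms.append((1 / len(signs), matmul(matmul(u, rho), u)))
mixed = scaled_sum(mixture_terms)

print(direct == mixed, direct)
```

Both routes give the diagonal matrix with entries $\tfrac12, 0, 0, \tfrac12$, i.e. the classically correlated state of Eq. (17): the off-diagonal terms acquire the sign $s_1 s_2$, which averages to zero.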
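Now, let us rewrite the inequality as follows:

$$ E_r^S(\rho) + S(\rho) \le S(\Lambda(\rho)) + E_r^S(\Lambda(\rho)) \tag{37} $$

Thus we have to prove that the function $f(\rho) = E_r^S(\rho) + S(\rho)$ is nondecreasing under dephasing. This result is parallel to that of [46], where it was proven that the above function does not decrease under (global) mixing. The proof is directly inspired by 01.
We have

$$ f(\Lambda(\rho)) = \inf_{\sigma \in S} \left( -\mathrm{tr}\,\Lambda(\rho)\log\sigma \right) \tag{38} $$
$$ = \inf_{\sigma \in S} \left( -\sum_i p_i\, \mathrm{tr}\,\rho_i \log\sigma \right) \ge \sum_i p_i \inf_{\sigma \in S} \left( -\mathrm{tr}\,\rho_i \log\sigma \right) $$
$$ = \sum_i p_i \inf_{\sigma \in S} \left( -\mathrm{tr}\,\rho \log\sigma_i \right) = \sum_i p_i \inf_{\sigma \in S} \left( -\mathrm{tr}\,\rho \log\sigma \right) $$
$$ = f(\rho) $$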

where $\rho_i = U^i_A \otimes I_B\; \rho\; U^{i\dagger}_A \otimes I_B$ and $\sigma_i = U^{i\dagger}_A \otimes I_B\; \sigma\; U^i_A \otimes I_B$. The inequality comes from properties of the infimum, and the last but one equality comes from the fact that the set $S$ is invariant under product unitary operations. This ends the proof of the lemma. □

Proof of Theorem 3. The basic ingredient of the proof is the monotonicity of the function $f(\rho) = E_r(\rho) + S(\rho)$ under CLOCC. (In entanglement theory the important functions are the ones that cannot increase under a suitable class of operations, while here we need a function that does not decrease under our class of operations. This once more shows that our approach is in a sense dual to the usual entanglement theory.) As we have already discussed, any CLOCC operation can be decomposed into basic ones: (i) local unitary transformations, (ii) local dephasing and (iii) noiseless sending of dephased qubits. Of course a local unitary operation changes neither the entropy nor $E_r$, so the function $f$ remains constant. The lemma we have just proved tells us that local dephasing can only increase the function $f$. Consider now the last component, sending dephased qubits. Clearly the entropy again does not change during such operations. It remains to show that $E_r$ does not change under sending dephased qubits. Consider a state $\rho_{ABB'}$ with one dephased qubit $B'$ on Bob's site, and consider the separable state $\sigma_{ABB'}$ closest to it. Since the relative entropy of entanglement is in particular monotone under dephasings, we can choose this state to have the qubit $B'$ dephased too. Consider then the state $\rho_{AA'B}$, where the qubit $A'$ is the qubit $B'$ after being sent by Bob. We now apply the procedure of sending the qubit $B'$ to the state $\sigma_{ABB'}$ and obtain a new separable state $\sigma_{AA'B}$. By construction we have $S(\rho_{ABB'}|\sigma_{ABB'}) = S(\rho_{AA'B}|\sigma_{AA'B})$. Thus $E_r$ could only go down. However, we can repeat the reasoning with the qubit sent in the converse direction, and conclude that $E_r$ does not change.

In this way we have shown that the function $f$ cannot decrease under CLOCC operations. This means that for any protocol that brings the initial state $\rho$ to a final separable state $\rho'$ we have

$$ f(\rho') \ge f(\rho) \tag{39} $$

However, the target state is separable, hence it has $E_r = 0$. We obtain

$$ S(\rho') - S(\rho) \ge E_r(\rho) \tag{40} $$

which tells us that in any protocol that ends up with a separable state, the increase of entropy is no smaller than the relative entropy of entanglement. This ends the proof. □

B. Connection with bounds obtained via semidefinite programming 

Semidefinite programming techniques have been used to obtain lower bounds on the regularized deficit. The following general bound was obtained:

$$ \Delta^{\infty}(\varrho) \ge \sup_{\sigma} \left( -\log_2 \lambda_{\max}(|\sigma^{\Gamma}|) - S(\varrho) - S(\varrho|\sigma) \right) \tag{41} $$

where $\lambda_{\max}$ denotes the greatest eigenvalue, and $\Gamma$ is the partial transposition of the matrix. The value of the bound has been calculated for Werner states and isotropic states. It turned out that for those states it is exactly equal to the regularized relative entropy of entanglement. This is compatible with Theorem 3. It is an interesting question what the general relation is between the bound (41) and the regularized $E_r$.

C. Discussion of the problem of "noncommuting choice" 

We have proved that the deficit satisfies the following inequality:

$$ E_r^{PC} \ge \Delta \ge E_r \tag{42} $$

Yet we have not been able to prove that $\Delta = E_r^{PC}$. Let us discuss the main obstacles which we encountered. The question is actually as follows: can there be a better protocol than dephasing in the optimal IPB? The latter protocol has a fundamental feature. Namely, in the series of subsequent local dephasings, each dephasing is compatible with the previous one, in the sense that they commute with each other. In other words, each dephasing is in some sense ultimate: it divides the total Hilbert space into blocks, so that all subsequent dephasings are performed within blocks, in a basis that is compatible with the blocks. Another way of viewing it is to say that what was sent from Alice to Bob or vice versa will remain classical, that is, diagonal in a fixed distinguished basis. The main open question is now the following: is it enough for Alice and Bob to follow this restriction, or should they violate this rule to draw more information?

We can formulate this fundamental problem in a more tractable way if we look through the proof of Theorem 3
and find where the proof fails if instead of separable states one takes pseudo-classically correlated states. Almost
the entire proof can be carried forward without alteration, apart from one small item: the invariance of $E_r^{PC}$ under
sending dephased qubits. $E_r$ was invariant mainly because we could choose the closest separable state to be also
dephased on that qubit. This is because the set of separable states is closed under local dephasings. However, the set of
pseudo-classically correlated states is not. This does not rule out the possibility that the closest pseudo-classically
correlated state indeed has the qubit dephased. However, we were not able to prove or disprove it. We formulate
the problem here in a formal way:

Problem. Consider a bipartite state that can be written in the following form:

$$\rho_{AB} = p_1 \rho^1_{AB} + p_2 \rho^2_{AB} \qquad (43)$$

where $\rho^1_{AB}$ and $\rho^2_{AB}$ are orthogonal on subsystem $A$, i.e. the reduced states $\rho^i_A$ have disjoint supports. Can the closest
pseudo-classically correlated state in relative entropy distance be written in this form?



D. Deficit and relative entropy distance for one-way and zero-way scenarios

Finally, let us note that the needed results can be obtained easily for the one-way and zero-way scenarios. The problem with
the two-way scenario is that Alice and Bob could draw more information than they obtain by measuring in the optimal IPB basis. The
source of difficulty was that in a many-round protocol, Alice and Bob could make dephasings that would not commute
with the dephasings they made in previous steps. In the one-way case there is no such danger, as there is only one
round. The zero-way situation is the simplest. The only thing Alice and Bob can do is to dephase the subsystems in some
bases, and the only problem is to find the optimal bases (so that they produce the smallest amount of entropy). The
versions of lemma ^ m one-way and zero- way case can be proven in the same way. Thus in those cases the deficits 
are equal to relative entropy distance to the two sets of states - classically correlated states and one-way classically 
correlated states (|T$)l . 



E. Multipartite states 

We can define set of pseudo-classically correlated states also in the case of multipartite states. Then one can 
formulate version of Theorem QJin the latter case. Since the arguments we have used did not depend on number of 
parties, Theorem is then true also in multipartite case. Similarly theorems [21 and [21 hold in the multipartite case. 



VII. BASIC IMPLICATIONS OF THE THEOREM (INFORMATIONAL NONLOCALITY) 

The theorems obtained in the previous section allow us to obtain the following results for both bipartite as well as 
multipartite states. 

• $\Delta$ is no smaller than the distillable entanglement $E_D$:

$$\Delta \geq E_D \qquad (44)$$
Indeed, the latter is bounded from above by relative entropy of entanglement |4S[ . 

• Moreover, Theorem 3 implies that the quantum deficit is no smaller than the coherent information:

$$\Delta(\rho) \geq S(\rho_X) - S(\rho) \qquad (45)$$

where $X = A, B, C, \ldots$, or equivalently

$$I_l(\rho) \leq N - S(\rho_X) \qquad (46)$$

This is because it was proven that in bipartite case |4flj , relative entropy of entanglement is bounded from below 
by coherent information Sx — S. For multipartite states, one gets it by noting that multipartite relative entropy 
of entanglement is no smaller than the one versus some bipartite cut. Then one applies the mentioned bipartite 
result. 

• Any entangled state is informationally nonlocal, i.e. it has nonzero deficit:

$$\Delta(\varrho_{entangled}) > 0 \qquad (47)$$

This follows from the fact that when a state is entangled, it has nonzero relative entropy of entanglement.
Note however that there exist separable states which are informationally nonlocal,

$$\Delta(\varrho_{separable}) > 0, \qquad (48)$$

for some separable states. We will now discuss an example of such a state and relate it to so-called "nonlocality
without entanglement".

• The theorem allows for an easy proof that for pure bipartite states the deficit is equal to entanglement. Indeed,
from the theorem we have that the deficit is no smaller than the relative entropy of entanglement. On the other hand, a simple protocol
of dephasing Alice's part in the eigenbasis of the state of her subsystem, and sending the result to Bob, gives the
amount of information $2\log d - S(\rho_A)$. Thus the deficit is no greater than the entropy of the subsystem. However, the
latter is equal to the relative entropy of entanglement (this is a reflection of the fact that in the asymptotic regime there
is only one measure of entanglement for pure states). For multipartite pure states there does not exist a unique
entanglement measure. We have the following open question: for multipartite pure states, is the deficit equal
to the relative entropy of entanglement,

$$E_r(\psi) = \Delta(\psi)? \qquad (49)$$

If so, the deficit would be an entanglement measure for all pure states. And since the deficit is an operational quantity,
we would have an operational interpretation for the relative entropy of entanglement for pure states.

Note here that in general the deficit is not monotone under LOCC, and even under CLOCC. In contrast, $I_l$ is
monotone under CLOCC.

• From the above reasoning and theorem [21 it follows that thermodynamical cost of erasure of entanglement of 
pure states is equal to their entanglement, (c.f. fl4lll5|P 



A. Non-locality without entanglement and with distinguishability 

One form of non-locality we are familiar with is entanglement. Another form of non-locality was introduced in [10]:
so-called non-locality without entanglement. There, it was shown that there are ensembles of states which, although
product, cannot be distinguished from each other under LOCC with certainty. Ensembles of product states can thus have
a form of non-locality. Other ensembles were exhibited which were distinguishable, but for which distinguishing was
thermodynamically irreversible. This can be thought of as non-locality without entanglement but with distinguishability.
All those results were obtained for ensembles.

Here we report similar kinds of nonlocality for states. Namely, we will exhibit states which are separable, and which
can be created out of ensembles of distinguishable states, but which contain unlocalizable information such that $\Delta \neq 0$
(at least for single copies). In fact, one can find such states which have an eigenbasis where each eigenket is perfectly
distinguishable.

An example is the state given by

$$\rho = \frac{1}{4}|00\rangle\langle 00| + \frac{1}{4}|11\rangle\langle 11| + \frac{1}{2}|\psi^-\rangle\langle\psi^-|. \qquad (50)$$

It is a separable state, which can be seen either by construction, or because it has positive partial transpose, which
is a sufficient condition for separability in dimension $2 \otimes 2$. Its eigenkets $|00\rangle$, $|11\rangle$, $|\psi^-\rangle$ are clearly perfectly distinguishable under
LOCC, since Alice and Bob just need to measure in the computational basis and compare results to know which of the
three basis states they have. Nonetheless, it clearly has non-localisable information. To localize all the information,
one would need to dephase it in the basis $|00\rangle$, $|11\rangle$, $|\psi^-\rangle$, but this cannot be done under CLOCC, since one cannot
dephase using a projector on $|\psi^-\rangle$. The proof follows from Theorem 2: we know that the optimal protocol is for Alice
to dephase her side in some basis, and then send the state to Bob. Indeed, for two qubits, all implementable product
bases are one-way implementable, i.e. they are of the form $\{|i\rangle|\psi^i_j\rangle\}$ where $\{|\psi^i_j\rangle\}_j$ are bases themselves. Thus for
the one-copy case, which we consider here, the optimal protocol is a one-way protocol. Since the state is symmetric,
it does not matter which way the communication goes (from Alice to Bob or vice versa).

A direct calculation shows that the optimal basis is $|0\rangle \pm |1\rangle$ at one of the sites. This yields $I_l = \frac{3}{4}\log_2 3 - 1 \approx 0.1887$,
while $I = 1/2$, giving a value of $\Delta = I - I_l \approx 0.3113$. There are thus separable states which exhibit non-locality in that all the
information cannot be localized even though all the basis elements of the state are perfectly distinguishable.
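
The one-way calculation for this example can be sketched numerically. The snippet below is only an illustration, not the authors' code: it assumes the weights 1/4, 1/4, 1/2 for the state (50), a projective measurement on Alice's qubit parametrised by hypothetical angles `theta` and `phi`, and the single-copy expressions $I = 2 - S(\rho)$ and $I_l = 2 - H(\{p_k\}) - \sum_k p_k S(\rho_B^k)$.

```python
import math

def h(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def eig2(m):
    """Eigenvalues of a 2x2 Hermitian matrix given as nested lists."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return [tr / 2 + disc, tr / 2 - disc]

# Assumed state: 1/4|00><00| + 1/4|11><11| + 1/2|psi-><psi-|, basis index = 2*a + b.
rho = [[0j] * 4 for _ in range(4)]
rho[0][0] = rho[3][3] = 0.25
rho[1][1] = rho[2][2] = 0.25      # diagonal part of (1/2)|psi-><psi-|
rho[1][2] = rho[2][1] = -0.25     # coherences of (1/2)|psi-><psi-|

def localisable_info(theta, phi=0.0):
    """I_l when Alice dephases in e1 = (cos t, e^{i phi} sin t) and its orthogonal partner."""
    e1 = [complex(math.cos(theta)),
          math.sin(theta) * complex(math.cos(phi), math.sin(phi))]
    e2 = [-e1[1].conjugate(), e1[0].conjugate()]
    probs, cond_entropy = [], 0.0
    for e in (e1, e2):
        # Bob's unnormalised conditional state <e| rho |e> (contraction on Alice's qubit).
        m = [[sum(e[a].conjugate() * rho[2 * a + b][2 * a2 + b2] * e[a2]
                  for a in range(2) for a2 in range(2))
              for b2 in range(2)] for b in range(2)]
        p = (m[0][0] + m[1][1]).real
        probs.append(p)
        if p > 1e-12:
            cond_entropy += p * h([lam / p for lam in eig2(m)])
    return 2 - h(probs) - cond_entropy

total_info = 2 - h([0.25, 0.25, 0.5])    # spectrum of rho is (1/4, 1/4, 1/2, 0)
best = localisable_info(math.pi / 4)     # Alice measures in the |0> +- |1> basis
print(best, total_info - best)           # roughly 0.1887 and 0.3113
```

Under these assumptions the computational basis (`theta = 0`) localises nothing, while the $|0\rangle \pm |1\rangle$ basis reproduces $I_l = \frac{3}{4}\log_2 3 - 1$.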

VIII. INFORMATIONAL NONLOCALITY OF MULTIPARTITE STATES



The approach considered here turns out to be quite valuable in the case of multipartite states. One of the reasons
for this is that one can not only quantify the quantumness of correlations along various splittings, as is commonly
done, but one can also look at the total amount of localizable information that a given state possesses if all parties
cooperate. In other words, in addition to the various vector measures defined for a particular splitting of the state,
e.g. $AB|CD$, one also has a scalar measure which is defined for the state as a whole. One can calculate $\Delta$ for various
bipartite splittings by grouping parties together, or one can calculate $\Delta$ for the entire state. In fact, one can consider
all possible groupings, such as $AB|CD|EF$ etc. This allows one to explore multipartite correlations in more detail,
and also allows one to ascribe a single quantity to a particular state in order to rank various states in terms of their
total quantum correlations.

By considering a family of states for a number of parties $N$, one can calculate the information deficit per party
$\Delta(\rho_N)/N$. We find that it goes to zero for the generalized GHZ state, and to infinity for the Aharonov state, as $N$
goes to infinity. Of the states we consider, we shall thus find that the Greenberger-Horne-Zeilinger (GHZ) state is the
least informationally nonlocal, while the so-called Aharonov state is the most informationally nonlocal.



A. Schmidt decomposable states 



Consider the information deficit for the $N$-party GHZ state

$$|\psi_{NGHZ}\rangle = |111\ldots1\rangle + |222\ldots2\rangle + \ldots + |NNN\ldots N\rangle \qquad (51)$$

where we depart slightly from convention by taking the dimension of each party's state to also scale like $N$. This
state is thus more entangled than if one were to give each party a qubit, and we do so in order to fairly compare
our results with other entangled states. The deficit for the GHZ was calculated in 0] where it was found to be 
$\Delta(\psi_{NGHZ}) = \log N$. Essentially, once one party makes a measurement, all the other parties can learn which state
they have without performing a measurement, thus $I_l = (N-1)\log N$, while the total state is of dimension $N^N$,
hence $I = N\log N$. Therefore

$$\lim_{N \to \infty} \Delta(\psi_{NGHZ})/N = 0 \qquad (52)$$

This is in keeping with the notion that the GHZ state is rather fragile, since if only one of the subsystems becomes dephased,
the entire state becomes classical.

One can generalize this to any multipartite state which can be written in a Schmidt basis, i.e.

$$|\psi_{NS}\rangle = \sum_i c_i \prod_{n=1}^{N} |\phi_{n i}\rangle \qquad (53)$$

In that case, one finds $\Delta(\psi_{NS}) = S(\rho_A)$, where $S(\rho_A)$ is any of the subsystem entropies (they are all equal). This follows
directly from inequality (46) and it holds in the asymptotic regime of many copies.



B. An example of a non-Schmidt decomposable state: The W state in three qubits 



A more complicated example is the "W state"

$$|\psi_W\rangle_{ABC} = \frac{1}{\sqrt{3}} (|100\rangle + |010\rangle + |001\rangle)$$

and we ask how much localizable information $I_l$ can be extracted under one-way CLOCC by using it
as a shared state. Since each party only has one qubit, we can use Theorem 2 to calculate it. This is because if each
party only holds a single qubit, the optimal protocol will only need one-way communication, and will be equivalent
to having one party measure and then tell her results to the other parties, who will then hold a pure state between
them.

Let Alice measure her part of the state in the basis $\{|e_i\rangle\}$ and send the result to Bob and Charlie. After the measurement

$$|\psi_W\rangle_{ABC} \to \varrho_{ABC} = \sum_i p_i |e_i\rangle\langle e_i| \otimes \varrho^i_{BC} \qquad (54)$$

Then Alice obtains the ensemble $\{p_i, |e_i\rangle\}$. Bob and Charlie obtain the ensemble $\{p_i, \varrho^i_{BC}\}$, where the $\varrho^i_{BC}$ are of course pure states.
Bob and Charlie know which of the states $\{\varrho^i_{BC}\}$ they have, because they have obtained information about the result
of the measurement by Alice. Therefore, the total amount of information that can be extracted from $|\psi_W\rangle_{ABC}$
locally, by such a protocol, is given by

$$I(\varrho_A) + p_1 I_l(\varrho^1_{BC}) + p_2 I_l(\varrho^2_{BC}) \qquad (55)$$

where $\varrho_A = \sum_i p_i |e_i\rangle\langle e_i|$, so that

$$I(\varrho_A) = 1 - H(\{p_i\}) \qquad (56)$$

and where (since the $\varrho^i_{BC}$ are pure)

$$I_l(\varrho^i_{BC}) = 2 - S(\varrho^i_B), \qquad (57)$$

with $\varrho^i_B$ being the reduced density matrix of $\varrho^i_{BC}$. So for an arbitrary von Neumann measurement, $I_l$ for
the W state is given by

$$I_l(\psi_W) = 3 - H(\{p_i\}) - \sum_{i=1}^{2} p_i S(\varrho^i_B)$$

$$= 3 - H\left(\frac{1+|x|^2}{3}\right) - \frac{1+|x|^2}{3} H\left(\frac{1}{2} + \frac{\sqrt{1+2|x|^2-3|x|^4}}{2+2|x|^2}\right) - \frac{2-|x|^2}{3} H\left(\frac{1}{2} + \frac{\sqrt{4|x|^2-3|x|^4}}{4-2|x|^2}\right)$$

where the measurement is performed in the basis $\{|e_i\rangle\}$ given by

$$|e_1\rangle = x|0\rangle + y|1\rangle,$$
$$|e_2\rangle = y^*|0\rangle - x^*|1\rangle.$$

One can check that for von Neumann measurements, the largest amount of local information extractable is 1.45026.
It is achieved for measurement in the basis $\{|e_i\rangle\}$, where either $x^2 = 1/3$ or $x^2 = 2/3$ (see Fig. 2). Contrary to
naive expectations, dephasing in the computational basis is the worst choice. Also the basis $|\pm\rangle$ ($x^2 = 1/2$) is not optimal.
Interestingly, the optimal bases are not incidental. Rather, these are the bases for which the probabilities of transition
into the $|0\rangle$, $|1\rangle$ states are the same as the probabilities of getting those states by Alice measuring the W state in the basis $|0\rangle$, $|1\rangle$.
In the regime of single copies, this protocol is optimal by Theorem 3, therefore for the W state, $I_l = 1.45026$. This
is less than the amount of localisable information for the corresponding GHZ state $|\psi_{GHZ}\rangle = \frac{1}{\sqrt{2}}(|000\rangle + |111\rangle)$, thus
we would argue that the W state exhibits more non-local correlations.
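
As a rough numerical check of the numbers quoted above, the W-state expression can be evaluated directly. This is a minimal sketch under stated assumptions: the closed form used here is the one obtained from the Schmidt coefficients of Bob and Charlie's conditional states for Alice's basis $|e_1\rangle = x|0\rangle + y|1\rangle$, and the function name `w_localisable_info` and its argument `x2` (standing for $|x|^2$) are illustrative only.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def w_localisable_info(x2):
    """I_l of the W state when Alice measures with |x|^2 = x2 (assumed closed form)."""
    p1, p2 = (1 + x2) / 3, (2 - x2) / 3          # outcome probabilities
    # Larger eigenvalue of Bob's reduced state for each outcome.
    lam1 = 0.5 + math.sqrt(max(1 + 2 * x2 - 3 * x2 ** 2, 0.0)) / (2 + 2 * x2)
    lam2 = 0.5 + math.sqrt(max(4 * x2 - 3 * x2 ** 2, 0.0)) / (4 - 2 * x2)
    return 3 - h2(p1) - p1 * h2(lam1) - p2 * h2(lam2)

for x2 in (0.0, 1 / 3, 0.5, 2 / 3, 1.0):
    print(x2, round(w_localisable_info(x2), 5))
```

With this formula the computational basis ($x^2 = 0$ or $1$) gives about 1.41504, the $|\pm\rangle$ basis about 1.44995, and $x^2 = 1/3$ or $2/3$ about 1.45026; the curve is very flat near its maximum.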



C. The Aharonov state and quasi-unlocalisable information 



We next consider the so-called Aharonov "diamond" state. It is essentially given by anti-symmetrizing $N$ $N$-dimensional
states. For three parties, the unnormalized state is

$$|\psi_{3A}\rangle = |012\rangle - |021\rangle + |120\rangle - |102\rangle + |201\rangle - |210\rangle \qquad (58)$$

and in general it is

$$|\psi_{NA}\rangle = \sum_{\text{permutations}} \epsilon_{a_1 \ldots a_N} |a_1 \ldots a_N\rangle \qquad (59)$$

where $\epsilon_{a_1 \ldots a_N}$ is the permutation symbol (Levi-Civita density).

It has the property that if one party measures their state in any basis, and tells their result to the rest of the parties,
they will then still hold another Aharonov state of dimension $N - 1$. Since this is a pure state of dimension $N^N$,
the total amount of information is $I = N\log N$. On the other hand, under the protocol where the parties take turns
measuring, it is easy to see that after each measurement, the other parties will still be left with a locally maximally
mixed state. Finally, however, there will be two parties left, and they will share a singlet, which can be converted
into 1 bit of localized information.

The amount of localisable information is therefore $I_l = 1$ regardless of how large $N$ is. This is optimal by Theorem
3 for single copies. We thus have that $\Delta(\psi_{NA})/N = \log N - 1/N$, which grows logarithmically to infinity with $N$.
The amount of localisable information per dimension goes very fast to zero, as $N^{-N}$. The Aharonov state can then
be thought of as a form of unlocalisable information. One might wonder if one can make the localisable information
strictly zero, as is the case for entanglement with bound entangled states. We will soon show that this is not the case.
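
The contrasting scaling of the two families can be tabulated in a few lines. The sketch below simply evaluates $\Delta/N = (I - I_l)/N$ with the values of $I$ and $I_l$ quoted in the text for the $N$-party GHZ and Aharonov states; the function names are illustrative.

```python
import math

def ghz_deficit_per_party(n):
    """Delta/N for the N-party GHZ state with local dimension N."""
    total = n * math.log2(n)          # I = N log N
    local = (n - 1) * math.log2(n)    # I_l = (N - 1) log N
    return (total - local) / n

def aharonov_deficit_per_party(n):
    """Delta/N for the N-party Aharonov state, taking I_l = 1 bit as in the text."""
    total = n * math.log2(n)          # I = N log N
    return (total - 1.0) / n

for n in (2, 4, 16, 256):
    print(n, ghz_deficit_per_party(n), aharonov_deficit_per_party(n))
```

The first column of values decays like $\log N / N$, while the second grows like $\log N$.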



D. General pure three qubit states 

In subsection VIII A we considered the localisable information of Schmidt decomposable states, and in subsection
VIII B we considered the W state, an example of a non-Schmidt decomposable state.

Let us here consider the general three-qubit pure state, which can be written in the form [51, 52]

$$|\psi\rangle_{ABC} = a|000\rangle + b|010\rangle + c|100\rangle + d|001\rangle + e|111\rangle, \qquad (60)$$

where only $a$ need be complex, while the rest of the coefficients are real. Of course we have $|a|^2 + b^2 + c^2 + d^2 + e^2 = 1$.

We can again use Theorem 2 to obtain the amount of localisable information. Let us suppose that Alice (A) measures
in the basis

$$|e_1\rangle = x|0\rangle + y|1\rangle,$$
$$|e_2\rangle = y^*|0\rangle - x^*|1\rangle, \qquad (61)$$

and sends the measurement outcome to Bob (B) and Charlie (C).

Depending on the measurement outcome, Bob and Charlie share the state

$$|\psi_{e_1}\rangle = \frac{1}{\sqrt{p}} \left( (x^* a + y^* c)|00\rangle + x^* d|01\rangle + x^* b|10\rangle + y^* e|11\rangle \right)$$

or

$$|\psi_{e_2}\rangle = \frac{1}{\sqrt{1-p}} \left( (ya - xc)|00\rangle + yd|01\rangle + yb|10\rangle - xe|11\rangle \right)$$

corresponding to the outcome $|e_1\rangle$ or $|e_2\rangle$ at Alice, where

$$p = |x^* a + y^* c|^2 + |x|^2 d^2 + |x|^2 b^2 + |y|^2 e^2$$

is the probability that $|e_1\rangle$ is obtained by Alice.

For such a protocol, the localisable information amounts to

$$I_l = \max_{x,y} \left[ 3 - H(p) - p\, S(\mathrm{tr}_B |\psi_{e_1}\rangle\langle\psi_{e_1}|) - (1-p)\, S(\mathrm{tr}_B |\psi_{e_2}\rangle\langle\psi_{e_2}|) \right] \qquad (62)$$

where we maximize over $x$ and $y$ to obtain the highest localisable information. This is an optimal protocol, and thus
we obtain the localisable information $I_l$. Let us denote the quantity in square brackets as $I_l^{xy}$.
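
The bracketed quantity can be evaluated for any choice of coefficients and measurement basis. The following is a minimal sketch rather than the authors' implementation: it assumes the form $3 - H(p) - p\,S_1 - (1-p)\,S_2$ for the bracket, with $S_1$, $S_2$ the entanglement entropies of the two conditional Bob-Charlie states, and the function name `localisable_info_xy` is hypothetical.

```python
import math

def h(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def localisable_info_xy(amps, x, y):
    """Assumed bracket 3 - H(p) - p*S1 - (1-p)*S2 for the state (60).

    amps = (a, b, c, d, e); Alice measures in e1 = x|0> + y|1>, e2 = y*|0> - x*|1>,
    with |x|^2 + |y|^2 = 1.
    """
    a, b, c, d, e = amps
    xc, yc = complex(x).conjugate(), complex(y).conjugate()
    # Unnormalised Bob-Charlie amplitudes, ordered |00>, |01>, |10>, |11>.
    branches = [
        (xc * a + yc * c, xc * d, xc * b, yc * e),   # Alice obtains e1
        (y * a - x * c, y * d, y * b, -x * e),       # Alice obtains e2
    ]
    probs, avg_entropy = [], 0.0
    for c00, c01, c10, c11 in branches:
        p = sum(abs(z) ** 2 for z in (c00, c01, c10, c11))
        probs.append(p)
        if p > 1e-12:
            # Schmidt spectrum of a two-qubit pure state from its coefficient determinant.
            det2 = abs(c00 * c11 - c01 * c10) ** 2 / p ** 2
            disc = math.sqrt(max(1 - 4 * det2, 0.0))
            avg_entropy += p * h([(1 + disc) / 2, (1 - disc) / 2])
    return 3 - h(probs) - avg_entropy

s = 1 / math.sqrt(3)
w_state = (0.0, s, s, s, 0.0)                         # a = e = 0, b = c = d = 1/sqrt(3)
print(localisable_info_xy(w_state, math.sqrt(1 / 3), math.sqrt(2 / 3)))
```

For the W state with $x^2 = 1/3$ this gives approximately 1.45026, and for the GHZ state ($a = e = 1/\sqrt{2}$, computational-basis measurement) it gives 2.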

Let us find the value of the localisable information for the case of the W state $|\psi_W\rangle$, using Eq. (62). Without
loss of generality, we may write $x = r \geq 0$ and $y = \exp(i\phi)\sqrt{1 - r^2}$. We now plot the expression
on the $(r, \phi)$-plane. The supremum can then be read off from the figure. This supremum will then correspond
to the localisable information for the W state. Interestingly, the supremum is attained on two parallel lines on the
$(r, \phi)$-plane.

Let us now choose an exemplary one-parameter subclass from the class in Eq. (60):

$$a = e = 0, \quad b = 0.1.$$

For this class, we plot the localisable information $I_l$ using real values of $x$ and $y$. Taking $x = r \geq 0$ and $y = \sqrt{1 - r^2}$,
$I_l^{xy}$ is plotted (in Fig. 3) as a function of $r$ and $c$. For a given $c$ (which then fixes the state), the value of $I_l$ can
be read from the figure.

FIG. 2: Plot of $I_l$ versus $x^2$ for measurement in the basis (61) for the W state. The optimal basis for maximizing $I_l$ is for Alice to
dephase (or measure) with $x^2 = 1/3$ or $2/3$. The basis $|\pm\rangle$ ($x^2 = 1/2$) is not optimal.

FIG. 3: Plot of the function $I_l^{xy}$ (Eq. (62)) for the three-qubit state in Eq. (60) for the case when $a = e = 0$, $b = 0.1$, on the
$(c, r)$-plane. Here $x = r$, $y = \sqrt{1 - r^2}$, and $r \geq 0$. The value of localisable information $I_l$ for a given $c$ is the supremum of $I_l^{xy}$
for that value of $c$.

IX. BELL MIXTURES 

The state of eq. (50) is a particular example of a mixture of Bell states

$$|\phi^{\pm}\rangle = \frac{1}{\sqrt{2}}(|00\rangle \pm |11\rangle),$$
$$|\psi^{\pm}\rangle = \frac{1}{\sqrt{2}}(|01\rangle \pm |10\rangle).$$

Here, for completeness, we calculate $\Delta$ for all states of this form, the so-called Bell-diagonal states. Up to local unitaries,
this includes all $2 \otimes 2$ states with local density matrices that are maximally mixed. Due to Theorem 2 we only need to
consider optimizing over projection measurements (without adding any ancilla locally) at one of the parties, say Alice.
Consider therefore the mixture

$$\varrho_{Bm} = p_1 P_{\phi^+} + p_2 P_{\phi^-} + p_3 P_{\psi^+} + p_4 P_{\psi^-} \qquad (63)$$

of the four Bell states in $2 \otimes 2$.

After an arbitrary projection valued (PV) measurement on Alice's side, projecting in the basis

$$\{|\bar{0}\rangle = a|0\rangle + b|1\rangle, \; |\bar{1}\rangle = b^*|0\rangle - a^*|1\rangle\},$$

let the global state be projected respectively to

$$P_{|\bar{0}\rangle} \otimes \varrho_0, \quad P_{|\bar{1}\rangle} \otimes \varrho_1. \qquad (64)$$

At this stage, the whole state is essentially on Bob's side. This is because we allow dephasing as one of our allowed
operations. Consequently, the locally extractable information after this set of operations is determined by the von Neumann entropy
of

$$p P_{|\bar{0}\rangle} \otimes \varrho_0 + (1 - p) P_{|\bar{1}\rangle} \otimes \varrho_1,$$

where $p$ is the probability of Alice obtaining the state $|\bar{0}\rangle$. The optimization yields the value

$$\Delta = 1 + H(p_1 + p_2) - S(\varrho_{Bm}) \qquad (65)$$

where $p_1$ and $p_2$ are the two highest coefficients of the Bell mixture $\varrho_{Bm}$.
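
Formula (65) is simple to evaluate. The sketch below assumes only that formula, with the Bell weights passed in any order; the function name is illustrative. As a consistency check, the state of eq. (50), read as the Bell mixture with weights 1/4, 1/4, 0, 1/2, gives $\Delta = \frac{3}{2} - \frac{3}{4}\log_2 3 \approx 0.3113$.

```python
import math

def h(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def bell_mixture_deficit(weights):
    """Eq. (65): Delta = 1 + H(p1 + p2) - S(rho), with p1, p2 the two largest Bell weights."""
    w = sorted(weights, reverse=True)
    top = w[0] + w[1]
    return 1 + h([top, 1 - top]) - h(w)

print(bell_mixture_deficit([1.0, 0.0, 0.0, 0.0]))      # pure Bell state: 1 bit
print(bell_mixture_deficit([0.25, 0.25, 0.0, 0.5]))    # state of eq. (50): about 0.3113
print(bell_mixture_deficit([0.25, 0.25, 0.25, 0.25]))  # maximally mixed state: 0
```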

If we consider only von Neumann measurements (without addition of ancilla) and if Alice and Bob are not allowed
to make any communication before they perform their measurements, then the zero-way information deficit $\Delta^{\emptyset}$ for
the Bell mixtures (63) is given by

$$\Delta^{\emptyset} = 1 + H(p_{\max}) - S(\varrho_{Bm}),$$

where

$$p_{\max} = \frac{1}{2}\left(1 + \max\{|t_{11}|, |t_{22}|, |t_{33}|\}\right)$$

with $t_{ii} = \mathrm{tr}(\sigma_i \otimes \sigma_i \varrho_{Bm})$. Note however that in this case, we are unable to show whether one can do better by
POVMs or whether more copies are useful.
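Consider however the isotropic state

$$\varrho_{iso} = \lambda |\phi_{\max}\rangle\langle\phi_{\max}| + (1 - \lambda)\frac{I}{d^2} \qquad (66)$$

in $d \otimes d$, where $|\phi_{\max}\rangle$ is the maximally entangled state in $d \otimes d$, which is invariant under $U \otimes U^*$ for any unitary $U$.
The one-way information deficit $\Delta^{\rightarrow}$ (as well as $\Delta^{\emptyset}$) is given by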



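$$\Delta^{\rightarrow} = \Delta^{\emptyset} = -\left(\lambda + \frac{1-\lambda}{d}\right)\log_2\left(\lambda + \frac{1-\lambda}{d}\right) - (d - 1)\frac{1-\lambda}{d}\log_2\frac{1-\lambda}{d} + \log_2 d - S(\varrho_{iso}) \qquad (67)$$

where

$$S(\varrho_{iso}) = -\left(\lambda + \frac{1-\lambda}{d^2}\right)\log_2\left(\lambda + \frac{1-\lambda}{d^2}\right) - (d^2 - 1)\frac{1-\lambda}{d^2}\log_2\frac{1-\lambda}{d^2}. \qquad (68)$$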

For the isotropic state, it is possible to prove, on the same lines as for Bell mixtures, that POVMs as well as more 
than one copy cannot help. 
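
A small numerical sketch of the isotropic-state deficit is given below. It rests on an assumption stated explicitly here: Alice dephases in a local basis, Bob's conditional state then has one eigenvalue $\lambda + (1-\lambda)/d$ and $d - 1$ eigenvalues $(1-\lambda)/d$, and the deficit is taken as the entropy of that conditional state plus $\log_2 d$ minus $S(\varrho_{iso})$. For $d = 2$ this agrees with formula (65) for Bell mixtures. The function name is illustrative.

```python
import math

def h(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def isotropic_deficit(lam, d):
    """One-way deficit of lam*|phi_max><phi_max| + (1 - lam)*I/d^2 (assumed derivation)."""
    mixed = (1 - lam) / d
    cond = h([lam + mixed] + [mixed] * (d - 1))            # entropy of Bob's conditional state
    noise = (1 - lam) / d ** 2
    s_iso = h([lam + noise] + [noise] * (d ** 2 - 1))      # spectrum of the isotropic state
    return cond + math.log2(d) - s_iso

print(isotropic_deficit(1.0, 2))    # maximally entangled two-qubit state: 1 bit
print(isotropic_deficit(0.5, 2))
print(isotropic_deficit(0.0, 3))    # maximally mixed state: 0 up to rounding
```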



A. Asymptotic regime 

For two qubits we easily evaluated the deficit, because the one-way and two-way deficits are equal in this case:
Alice's first measurement leaves no room for other measurements. So the only thing she should do is to communicate
the results to Bob, and communication from Bob is not needed. To put it in other words: the set of pseudo-classically correlated
states is equal to the one-way classically correlated states of the form (|T^|l . Thus it was enough to evaluate only 
one-way deficit. However, if we turn to regularization, this equivalence is no longer valid. This is because, to calculate
the regularization, one needs to evaluate the deficit for many copies. Thus the dimension of the system is high, and there is
room for many rounds, so we are not able to regularize the two-way deficit.

Concerning the one-way deficit, one can argue that it is additive for Bell-diagonal states. Moreover, borrowing qubits
does not help (it has been independently shown that in general, in the one-way case, borrowing
pure local qubits does not help |2fJ). We will provide the argumentation in section IXIV Al 

X. PURELY UNLOCALISABLE INFORMATION DOESN'T EXIST



One important aspect of entanglement theory is the existence of bound entangled states. These are states which
are entangled, in that they require entanglement to create, yet no entanglement can be drawn from them. In Section
VIII C we saw that in the multipartite case, there were states for which the amount of localisable information per party
went to zero as the number of parties increased. One can ask whether there is a strict analogy to bound entanglement:
are there states which have positive $I$, but for which $I_l = 0$? It turns out that the answer is no; the only state which
has $I_l = 0$ is the maximally mixed state. Here we prove this in the following lemma for the case of two parties. The
generalization to many parties is straightforward.

Lemma 3. From any state other than the maximally mixed state we can draw local information.

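Proof. Consider a state $\varrho$ on $C^d \otimes C^d$ such that $\varrho \neq \varrho_{mix} = I/d^2$. Then there exists an observable whose mean
value in the state $\varrho$ differs from its mean value in $\varrho_{mix}$. Every nonlocal observable can be decomposed into local operators,
so we can always find such an observable of the form $A \otimes B$ for which

$$\mathrm{Tr}[(A \otimes B)\varrho] \neq \mathrm{Tr}\left[(A \otimes B)\frac{I}{d^2}\right] \qquad (69)$$

Then

$$\mathrm{Tr}[(A \otimes B)\varrho] \neq \frac{1}{d^2}\mathrm{Tr}A\,\mathrm{Tr}B \qquad (70)$$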

ij ij 

Notice that the probability distribution for $\varrho$ in (71) is classical. We know that we can obtain a nonzero amount of
local information from any classical state besides the maximally mixed one. Thus we are able to find a
local operation that transforms every state satisfying the assumptions of the lemma into a state from which we
can draw local information. □

There is an open question, whether there exist states, for which localisable information is entirely equal to local 
information content, but which nevertheless are not product. In such case, one wouldn't be able to draw information 
from correlations at all. The classical deficit A c would be zero, even though state would be non-product. It is rather 
unlikely that such states exist, yet we have not been able to solve this question. 

We now prove a related theorem which follows from the above lemma, and which will be useful for the following 
section. Namely, we show that using pure states as a resource cannot help when distilling local information. One can 
think of such a process as catalysis where one uses pure states to produce more pure states from some shared state. 

Theorem 4. Local pure ancillas do not help in process of distilling local information. 



Proof. Assume that catalysis can help in drawing local information. Consider a state $\varrho$ which is not the
maximally mixed state, and the optimal protocol $P_1$ of distilling local information which does not use ancillas.
Consider also another protocol $P_2$, in which we first distill information from some of the copies of the state $\varrho$ using
$P_1$, and then use the distilled pure states to do catalytic distillation on the rest of the copies. Notice that we can
do this, because we know from Lemma 3 that we can distill local information, and thus also pure states, from $\varrho$. If
catalysis is helpful, then using $P_2$ we are able to obtain more local information than with the previous protocol.
Protocol $P_2$ does not use ancillas and is better than $P_1$, which is optimal. This leads to the required contradiction.

We showed that catalysis is useless for states with nonzero distillable information. It could help only in the case of
states with purely unlocalisable information, but we know from Lemma 3 that such states do not exist. This ends the
proof.

Remark. We know that to do catalytic distillation we need pure ancillas. One can notice that the states which
we want to use in protocol $P_2$ to do catalysis are not exactly pure. But these states come from distillation, so in the
limit of many copies they are equal to $|0\rangle^{\otimes rn}$ ($r$ is the rate of distillation of local information and $n$ is the number of
copies). This fact assures us that in the asymptotic regime of many copies we are able to perform catalysis.

XI. CAN CORRELATIONS BE MORE QUANTUM THAN CLASSICAL?

The total amount of correlations contained in a bipartite state is given by the mutual information

$$I_M = S(\rho_A) + S(\rho_B) - S(\rho_{AB}) \qquad (72)$$

One can easily see that our quantities for dividing correlations into ones which behave quantumly ($\Delta$) and classically
($\Delta_c$) satisfy

$$I_M = \Delta + \Delta_c. \qquad (73)$$

In other words, the total amount of correlations (given by $I_M$) can be divided into classical and quantum components.
Now one can ask whether the total correlations $I_M$ can be divided arbitrarily. Certainly for pure states, this is not
the case. For pure states, correlations which behave quantumly cannot exceed half of the total: for pure states $\psi$, we showed
that $\Delta = S(\rho_A)$, and thus it is always the case that $\Delta(\psi) = I_M/2$. For pure states, the quantumness of correlations
can never exceed the classicalness of correlations.

Now one can ask: can it be that one has states for which

$$\frac{\Delta(\varrho_{AB})}{I_M(\varrho_{AB})} > 1/2? \qquad (74)$$

If so, one could think of these states as having super-saturated quantum correlations, in that for a given amount of
mutual information $I_M$ they have a greater proportion of correlations which behave quantumly. In this sense, one
can think of such states as being more non-local than maximally entangled states.

One way to approach the above problem is to work with the relative entropy of entanglement. We know that both
the relative entropy of entanglement ($E_r$), with distance taken from separable states, and the von Neumann entropy
($S_{AB}$) are not greater than $\log_2 d$ for $d \otimes d$ states. Consequently, one has $E_r + S_{AB} \leq 2\log_2 d$. Can we have the
following stronger inequality?

$$2E_r + S_{AB} \leq 2\log_2 d. \qquad (75)$$

This is tight for maximally entangled states. Because the deficit is no smaller than the relative entropy of entanglement, it
follows that if the inequality is violated, then for some states inequality (74) is true, and we would have the curious
phenomenon. On the other hand, if the inequality is satisfied for all states, we would obtain a nice trade-off between
entanglement and noise.

In a recent work, Wei et al. [53] calculated (for two-qubit states) the maximal possible relative entropy of entanglement
$E_r$ (as well as other entanglement measures) for a given amount of mixedness (quantified by the von Neumann
entropy). Note that the inequality (75) would generically hold for two-qubit states if it is satisfied by these optimal
values. Indeed, examining the curves of the above paper, one finds that for any two-qubit state the inequality is
satisfied.

One can also find that for Werner states and maximally correlated states the inequality is satisfied too, for the
regularized relative entropy of entanglement. To see this, note that the asymptotic relative entropy of entanglement ($E_r^{\infty(PPT)}$)
(with distance taken from states with positive partial transpose (PPT)) is known for Werner states (mixtures of
projectors on symmetric and antisymmetric spaces) in $d \otimes d$ [54]. One may check that the relation

$$2E_r^{\infty(PPT)} + S_{AB} \leq 2\log_2 d \qquad (76)$$

is satisfied for all Werner states in arbitrary dimensions. However, note here that the relative entropy of entanglement
(from PPT states) is not additive for Werner states.

For the maximally correlated states, the relative entropy of entanglement (from PPT states) is known to be additive. Its
value is also explicitly known for all such states in $d \otimes d$. Via additivity, it is exactly equal to the asymptotic
relative entropy of entanglement (from PPT states). Precisely, for any state of the form

$$\varrho_{mc} = \sum_{ij} a_{ij} |ii\rangle\langle jj|,$$

we have

$$E_r^{\infty(PPT)} = E_r^{(PPT)} = -\sum_i a_{ii}\log_2 a_{ii} - S(\varrho_{mc}).$$

It is easy to check that the relation (76) is satisfied by any $\varrho_{mc}$ in $d \otimes d$.
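
The check for maximally correlated states can be illustrated for two qubits. This is a sketch under stated assumptions: the relative entropy of entanglement is taken as $-\sum_i a_{ii}\log_2 a_{ii} - S(\varrho_{mc})$, the nonzero spectrum of $\varrho_{mc}$ is that of the coefficient matrix $(a_{ij})$, and the names `tradeoff_lhs` and `eig2` are illustrative.

```python
import math
import random

def h(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 1e-12)

def eig2(m):
    """Eigenvalues of a 2x2 Hermitian matrix given as nested lists."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return [tr / 2 + disc, tr / 2 - disc]

def tradeoff_lhs(a):
    """2*E_r + S for a two-qubit maximally correlated state with coefficient matrix a."""
    s_rho = h(eig2(a))                            # entropy of the state
    e_r = h([a[0][0].real, a[1][1].real]) - s_rho
    return 2 * e_r + s_rho                        # to be compared with 2*log2(d) = 2

random.seed(1)
for _ in range(5):
    p = random.random()
    c = random.uniform(-1, 1) * math.sqrt(p * (1 - p))   # keeps the matrix positive
    print(round(tradeoff_lhs([[p, c], [c, 1 - p]]), 4))
```

In this form the left-hand side equals $2S(\varrho_A) - S(\varrho_{mc})$, which cannot exceed $2\log_2 d$; the bound is reached by the maximally entangled state ($a_{ij} = 1/2$).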

Thus we haven't found states for which the inequality would be violated for the regularized relative entropy of
entanglement. It remains an open question whether the trade-off between noise and entanglement represented by inequality (75)
is universally true, or whether there exist states for which there are more quantum than classical correlations.

XII. ZERO-WAY AND ONE-WAY SUBCLASSES



We now turn to additional measures of quantumness of correlations which arise when one restricts the communica- 
tions between Alice and Bob. In sections llXI IVIIll such restrictions were useful for evaluations of perhaps more basic 
two-way quantities. However they are more than just for ease of calculation - we shall also see that the restricted 
measures allow one to explore other aspects of non-locality. Additionally, there appears to be strong connections 
between the deficit and distillation of randomness from shared states. For example, it has just been shown in po| 
that the one-way deficit is equal to the mutual information minus the one-way distillable randomness pifij . 

As before, the optimal protocols by which the corresponding local informations are obtained amounts to producing 
"classical-like" states of least entropy by the respective operations. As mentioned in section IV Bl the theorems proven 
there apply equally well in these restricted scenarios with suitable modification. 

In any protocol of concentrating information to local form, the parties can stop at states of the form

$$\varrho'_{AB} = \sum_{ij} p_{ij} |i\rangle\langle i| \otimes |j\rangle\langle j|. \qquad (77)$$

However, for the two-way scenario, we have argued that one can stop already at pseudo-classically correlated states. When
one-way protocols are allowed, it is sufficient for the parties to stop at states of the form

$$\varrho_{AB} = \sum_i p_i |i\rangle\langle i| \otimes \varrho_i. \qquad (78)$$

Finally, for zero-way protocols, one has to achieve classical states (77). Consider for example the zero-way protocol
for a state $\varrho_{AB}$ by which the zero-way localisable information is attained. Without any classical communication (just by dephasing via an environment),
Alice and Bob change the state $\varrho_{AB}$ into a classical-like state $\varrho'_{AB}$ (of the form given in eq. (77)), so that $S(\varrho'_A) + S(\varrho'_B)$
is minimized, where $\varrho'_A$ and $\varrho'_B$ are the local density matrices of $\varrho'_{AB}$. Note that the parties must concentrate
information using classical communication. But this is only after they have performed all their dephasings. The
situation is therefore like in a Bell-type experiment.

Let us now show that $\Delta^{\emptyset}$ is an independently useful candidate for quantum correlations and can capture
interesting aspects of non-locality. The states that contain no quantum correlations would then be the ones with
$\Delta^{\emptyset} = 0$. Consider for example the states with eigenbasis (without normalization)

$|0\rangle_A |0\rangle_B, \quad |0\rangle_A |1\rangle_B, \quad |1\rangle_A (|0\rangle + |1\rangle)_B, \quad |1\rangle_A (|0\rangle - |1\rangle)_B$, (79)

where $|0\rangle$ and $|1\rangle$ are the eigenvectors of the Pauli matrix $\sigma_z$. Such states are the ones used in the
BB84 quantum cryptography protocol [?]. This set of orthogonal states is distinguishable locally, but not by zero-way
communication. Bob must wait for Alice's measurement result (in the $\sigma_z$-basis) to decide whether to
perform a measurement in the $\sigma_z$-basis or in the $\sigma_x$-basis. Therefore a mixture of the states in eq. (79), where
the mixing probabilities are all different from each other (so that the spectrum of the resulting state is non-degenerate),
would have nonvanishing $\Delta^{\emptyset}$. This is because an arbitrary dephasing by Bob on such a mixture, before obtaining
Alice's result, would result in no information being extracted from the state (by Bob). Consequently there would be
an information deficit when trying to extract information locally, because globally of course all the information is
extractable from such a state. All the information is also extractable by one-way or two-way communication. This is
in contrast to states which have an eigenbasis

$|0\rangle_A |0\rangle_B, \quad |0\rangle_A |1\rangle_B, \quad |1\rangle_A |0\rangle_B, \quad |1\rangle_A |1\rangle_B$,

for which all the information is extractable from the state locally, by measurement by both the parties in the
$\sigma_z$-basis, without any communication.
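The contrast between the two eigenbases can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the mixing weights (0.4, 0.3, 0.2, 0.1) and the helper names `entropy`, `qubit_basis` and `dephase` are illustrative choices, not taken from the paper. It dephases a mixture of the states (79) either on Alice's side only (in the $\sigma_z$-basis), or on both sides with Alice fixed to the $\sigma_z$-basis and Bob's basis scanned over a grid. Alice's basis is not optimized, so the scan only illustrates, and does not prove, that the zero-way entropy production stays positive.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def qubit_basis(theta, phi):
    """Orthonormal qubit basis; the first vector points along (theta, phi) on the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    v0 = np.array([c, np.exp(1j * phi) * s])
    v1 = np.array([-np.exp(-1j * phi) * s, c])
    return [v0, v1]

def dephase(rho, basis_a=None, basis_b=None):
    """Complete local dephasing on A and/or B (None leaves that side untouched)."""
    eye = [np.eye(2, dtype=complex)]
    proj_a = eye if basis_a is None else [np.outer(v, v.conj()) for v in basis_a]
    proj_b = eye if basis_b is None else [np.outer(v, v.conj()) for v in basis_b]
    out = np.zeros((4, 4), dtype=complex)
    for pa in proj_a:
        for pb in proj_b:
            p = np.kron(pa, pb)
            out += p @ rho @ p
    return out

# Mixture of the four states of eq. (79) with a non-degenerate spectrum (hypothetical weights).
k0, k1 = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
plus, minus = (k0 + k1) / np.sqrt(2), (k0 - k1) / np.sqrt(2)
vecs = [np.kron(k0, k0), np.kron(k0, k1), np.kron(k1, plus), np.kron(k1, minus)]
rho = sum(w * np.outer(v, v.conj()) for w, v in zip([0.4, 0.3, 0.2, 0.1], vecs))

z_basis = qubit_basis(0.0, 0.0)

# One-way: Alice dephases in the sigma_z basis; the state is block diagonal, so nothing changes.
one_way_gap = entropy(dephase(rho, basis_a=z_basis)) - entropy(rho)

# Zero-way: Bob must also dephase without knowing Alice's outcome; scan his basis on a grid.
zero_way_gap = min(
    entropy(dephase(rho, basis_a=z_basis, basis_b=qubit_basis(t, f))) - entropy(rho)
    for t in np.linspace(0, np.pi, 25)
    for f in np.linspace(0, 2 * np.pi, 24, endpoint=False)
)

print(f"one-way entropy production:  {one_way_gap:.4f}")
print(f"zero-way entropy production: {zero_way_gap:.4f}")
```

With these weights the one-way entropy production vanishes up to rounding, while every choice of Bob's basis on the grid produces some entropy, in line with the argument above.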

We therefore see that the quantum behavior of correlations could result from the distinctly quantum but "local"
property of nonorthogonality. Here we call nonorthogonality a local property, as it does not a priori require a tensor
product structure to manifest itself. It is this nonorthogonality that manifests itself in a more complex form in the
examples of LOCC-indistinguishable orthogonal product bases [10, 56, 57]. More generally, it may be the reason for
any case of LOCC-indistinguishability of orthogonal states [58, 59, 60, 61, 62].

An interesting issue is the relation between $\Delta^{\emptyset}$ and mutual information. In Section XI we asked whether
there exist states for which $\Delta$ would be more than half of the mutual information. The same question can be asked in
the case of the one-way and zero-way deficits. Lukasz Pankowski has performed numerical simulations to evaluate
$\Delta^{\emptyset}$ versus mutual information. The results are presented in figure 4. Surprisingly, there are states for
which the deficit is almost equal to the mutual information. Thus the measurement destroys almost all correlations! The
quantum correlations do not imply classical correlations (see [63] in this context).

With respect to the pure states considered in Section IV F, it is easy to see that $\Delta^{\emptyset}$ is also equal to
$\Delta^{\rightarrow}$. This is also true for single copies of single qubit states, due to Theorem 2.
FIG. 4: Zero-way deficit is plotted versus mutual information for 100 000 random two qubit states. The upper line stands for
$\Delta^{\emptyset} = I_M$. The lower line denotes isotropic states of eq. (66). Two regimes are evident: in the first regime,
there are states for which the deficit is almost equal to the mutual information.

A. Expression for one-way $\Delta$

In this subsection we consider the expression for the one-way deficit $\Delta^{\rightarrow}$. In the case when only one-way
communication is allowed between the parties, the only thing that Alice and Bob can do is that Alice dephases her part in
some basis, and then sends her part to Bob. Dephasing transforms the state as

$\varrho_{AB} \to \varrho'_{AB} = \sum_i P_i \otimes I \, \varrho_{AB} \, P_i \otimes I = \sum_i p_i |i\rangle\langle i| \otimes \varrho_B^i$,

where $\{P_i = |i\rangle\langle i|\}$ forms a set of orthogonal one-dimensional projectors on the Hilbert space of Alice's part
of $\varrho_{AB}$ and $p_i$ are the probabilities of the corresponding outcomes which Alice would obtain if she performed
measurements with the same $P_i$'s rather than dephasing, while $\varrho_B^i$ is the state that Bob would obtain
conditionally on measurement outcome $i$. Thus

$p_i = \mathrm{tr}(\varrho_{AB} \, P_i \otimes I)$,
$\varrho_B^i = \frac{1}{p_i} \mathrm{tr}_A (P_i \otimes I \, \varrho_{AB} \, P_i \otimes I)$. (80)

The process of sending does not change the form of the state, so that the entropy of the final state at Bob is

$S(\varrho'_{AB}) = S(\varrho'_A) + \sum_i p_i S(\varrho_B^i)$,

where $\varrho'_A = \sum_i p_i |i\rangle\langle i|$ is the reduced density matrix of the A-part of $\varrho'_{AB}$. So finally
$I_l^{\rightarrow}$ takes the form

$I_l^{\rightarrow} = n_{AB} - \inf_{\{P_i\}} \left( S(\varrho'_A) + \sum_i p_i S(\varrho_B^i) \right)$

and correspondingly

$\Delta^{\rightarrow} = \inf_{\{P_i\}} \left( S(\varrho'_A) + \sum_i p_i S(\varrho_B^i) \right) - S(\varrho_{AB})$. (81)
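Eq. (81) can be evaluated numerically for two qubits. The following is a minimal sketch, assuming NumPy; the function names and the grid resolution are hypothetical, and the grid search over Alice's basis only yields an upper bound on the infimum rather than the infimum itself.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def qubit_basis(theta, phi):
    """Orthonormal qubit basis; the first vector points along (theta, phi) on the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return [np.array([c, np.exp(1j * phi) * s]),
            np.array([-np.exp(-1j * phi) * s, c])]

def one_way_cost(rho, basis):
    """S(rho'_A) + sum_i p_i S(rho_B^i) for Alice's rank-one projectors built from `basis`."""
    probs, cond = [], 0.0
    for v in basis:
        proj = np.kron(np.outer(v, v.conj()), np.eye(2))
        block = proj @ rho @ proj
        p = float(np.real(np.trace(block)))  # p_i of eq. (80)
        if p > 1e-12:
            # Trace out Alice: the reshaped block has indices (a, b, a', b').
            rho_b = np.trace(block.reshape(2, 2, 2, 2), axis1=0, axis2=2) / p
            cond += p * entropy(rho_b)
            probs.append(p)
    probs = np.array(probs)
    # rho'_A is diagonal with entries p_i, so S(rho'_A) is the Shannon entropy of {p_i}.
    return float(-np.sum(probs * np.log2(probs))) + cond

def one_way_deficit(rho, n_theta=25, n_phi=24):
    """Grid approximation (an upper bound) to eq. (81)."""
    best = min(
        one_way_cost(rho, qubit_basis(t, f))
        for t in np.linspace(0, np.pi, n_theta)
        for f in np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    )
    return best - entropy(rho)

singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_singlet = np.outer(singlet, singlet.conj())
rho_classical = np.diag([0.5, 0, 0, 0.5]).astype(complex)

d_singlet = one_way_deficit(rho_singlet)
d_classical = one_way_deficit(rho_classical)
print(d_singlet, d_classical)
```

For the singlet every basis gives one bit, its entanglement, while the classically correlated state gives zero once the grid contains its eigenbasis.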

Just as we showed that $\Delta$ was equal to the relative entropy distance to pseudo-classically correlated states, one can
also write $\Delta^{\rightarrow}$ and $\Delta^{\emptyset}$ as the minimum relative entropy distance to the sets of states
which can be created reversibly under the one-way and zero-way classes of operations, respectively.
XIII. RELATIONSHIP WITH OTHER MEASURES OF QUANTUMNESS OF CORRELATIONS

Let us now compare the deficit with other measures of quantumness of correlations, in particular the quantum
discord [?, ?]. The latter is defined formally as the difference of two classically equivalent expressions for the mutual
information, applied to quantum systems (taken to be a measuring apparatus and system). It was defined with respect
to a measurement $A_M$ (either a projective one, or a POVM (Positive Operator Valued Measure)) performed on the
apparatus A. One then defines the discord $\delta(A_M|B)$ with respect to this measurement, which results with probabilities
$p_i$ in joint states $\varrho^i_{AB} = |i\rangle\langle i|_A \otimes \varrho_B^i$. The discord is defined as

$\delta(A_M|B) = H(\{p_i\}) + \sum_i p_i S(\varrho_B^i) - S(\varrho_{AB})$. (82)

The relationship between $\delta(A_M|B)$ and $\Delta^{\rightarrow}$ (defined on single copies) was recently shown in [64],
where it was shown that the discord also has the interpretation of extraction of work by a demon, if one minimizes
$\delta(A_M|B)$ over all possible measurements $A_M$. Care must be taken, however, since with the definition of discord there
is no cost associated to pure states which are used in a POVM. Therefore, we note here that the relationship between the
discord and $\Delta^{\rightarrow}$ only applies if one optimizes the discord over von Neumann measurements, and disallows POVMs.

Finally, let us provide two explicit examples of cases where two-way communication is more powerful than one-way
communication, i.e. one has the strict inequality $\Delta < \Delta^{\rightarrow} = \inf_{A_M} \delta(A_M|B)$, where the
infimum is taken over von Neumann measurements $A_M$.

To this aim consider the basis (83) related to the sausage states of [?], which has been analyzed in [?].

Consider now any bipartite $3 \otimes 3$ state $\varrho_{two-way}$ that is diagonal in the above basis, but has nondegenerate
spectrum. It is relatively easy to provide a two-way protocol that distinguishes the vectors (83) without destroying them
(see [?]). Hence $\Delta$ vanishes. Evidently $\varrho_{two-way}$ is not of the form
$\sum_{i=1}^{N} |\phi_i\rangle\langle\phi_i| \otimes \varrho_i$ with orthogonal $\phi_i$, since there are no
three eigenvectors among (83) that have the same component on Alice's side. So both $\Delta^{\rightarrow}$ and the discord are
strictly positive for this state. Thus Maxwell's demons which communicate in both directions are more powerful than demons
who can only communicate in one direction.

Another simple example is to take states which have zero optimized discord or one-way deficit,

$\rho^{\rightarrow} = \sum_i p_i |i\rangle_A\langle i|_A \otimes \rho_B^i, \qquad \rho^{\leftarrow} = \sum_j p_j \rho_A^j \otimes |j\rangle_B\langle j|_B$, (84)

but in different directions of communication. Then take them each to be on orthogonal Hilbert spaces, and mix.
Such a state will have $\Delta = 0$, since both parties can just project on the two orthogonal Hilbert spaces to determine
whether they hold $\rho^{\rightarrow}$ or $\rho^{\leftarrow}$, and then the appropriate party can send her state down the
channel. On the other hand, one-way communication will be sufficient to completely localize one of the states but not always both.



XIV. RELATION WITH MEASURES OF CLASSICAL CORRELATION 

In this section we shall analyze the relation of the classical deficit [?] to already known measures of classical
correlations. It happens that both the zero-way and one-way deficits have their "counterparts" in such measures. There
is no known analog, however, for the two-way deficit.

Let us recall that the quantum deficit was defined as

$\Delta = I - I_l$.

One can think of it as describing how much better Alice and Bob can do under closed operations (CO) if they are
given a quantum channel instead of the classical channel. Because it feels the difference between the quantum and
classical channel, it tells us about the quantumness of correlations. Likewise, the classical deficit is given by

$\Delta_{cl} = I_l - I_{LO}$.

It tells us how much better two parties can do at localizing information if, instead of having no access to a channel 
i.e. closed local operations, they have access to a classical channel. Because the added resource is a classical channel, 
it shows how much better the parties can do by exploiting a classical channel. 

One can verify that $\Delta_{cl}$ and $\Delta$ add up to the quantum mutual information
$I_M(\varrho_{AB}) = S(\varrho_A) + S(\varrho_B) - S(\varrho_{AB})$. Thus

$\Delta_{cl} = I_M - \Delta$.
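The verification is a one-line computation. The sketch below assumes the definitions $I = n - S(\varrho_{AB})$ for the total information of a state of $n = n_A + n_B$ qubits, and $I_{LO} = n_A - S(\varrho_A) + n_B - S(\varrho_B)$ for the information available by local operations alone; if these are stated differently elsewhere in the paper, the intermediate steps change accordingly.

```latex
\begin{align}
\Delta_{cl} + \Delta
  &= (I_l - I_{LO}) + (I - I_l) = I - I_{LO} \\
  &= \bigl[n - S(\varrho_{AB})\bigr] - \bigl[n - S(\varrho_A) - S(\varrho_B)\bigr] \\
  &= S(\varrho_A) + S(\varrho_B) - S(\varrho_{AB}) = I_M(\varrho_{AB}).
\end{align}
```

The localizable information $I_l$ cancels, so the identity holds for every class of allowed communication as long as the same class is used in both deficits.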

More explicitly we have (cf. eq. (?))

$\Delta_{cl}(\varrho_{AB}) = S(\varrho_A) + S(\varrho_B) - \inf_{CLOCC} \left( S(\varrho'_A) + S(\varrho'_B) \right)$, (85)

i.e. $\Delta_{cl}$ is the optimal decrease of local entropies by means of CLOCC.

A. One-way measures 

Corresponding to the measure of quantumness of correlation under one-way classical communication (from Alice
to Bob) ($\Delta^{\rightarrow}$), given by eq. (81), we could have the following formula for classical correlation:

$\Delta_{cl}^{\rightarrow}(\varrho_{AB}) = \sup_{\{P_i\}} \left[ \{S(\varrho_A) - S(\varrho'_A)\} + \{S(\varrho_B) - \sum_i p_i S(\varrho_B^i)\} \right]$ (86)
$= \sup_{\{P_i\}} \left[ \delta S(A) + \delta S(B|A) \right]$.

Note that the supremum is taken over all local dephasings on Alice's side. Although we optimize over projection
measurements, one can effectively include POVMs by including all the required ancillas from the start. Remarkably,
it has been shown [20] that POVMs need not be considered when one goes to the limit of many copies.
In eq. (86), we have distinguished two terms. The second term,

$\delta S(B|A) = S(\varrho_B) - \sum_i p_i S(\varrho_B^i)$,

shows the decrease of Bob's entropy after Alice's measurement. The first one,

$\delta S(A) = S(\varrho_A) - S(\varrho'_A)$,

denotes the cost of this process on Alice's side, and is non-positive. It is zero only if Alice measures in the eigenbasis
of her local density matrix $\varrho_A$.

The expression for $\Delta_{cl}^{\rightarrow}$ is very similar to the measure of classical correlation introduced by
Henderson and Vedral [?],

$C_{HV} = \sup_{\{P_i\}} \left( S(\varrho_B) - \sum_i p_i S(\varrho_B^i) \right)$. (87)

Originally the supremum was taken over POVMs, but as mentioned we take the state acting already on a suitably
larger Hilbert space, unless stated otherwise explicitly.

The difference between the Henderson-Vedral classical correlation measure and the one given in eq. (86) is that the
former does not include Alice's entropic cost $\delta S(A)$ of performing dephasing. Hence in general,

$\Delta_{cl}^{\rightarrow} \le C_{HV}$.
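The inequality can be checked numerically for two qubits by evaluating both suprema over the same family of projective measurements on Alice's side. The following is a minimal sketch, assuming NumPy; the helper names, the grid resolution and the two test states are hypothetical choices, and the grid search only approximates the suprema from below.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def qubit_basis(theta, phi):
    """Orthonormal qubit basis; the first vector points along (theta, phi) on the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return [np.array([c, np.exp(1j * phi) * s]),
            np.array([-np.exp(-1j * phi) * s, c])]

def partial_trace(rho, keep):
    """Reduced state of a two-qubit operator; keep is 'A' or 'B'."""
    r = rho.reshape(2, 2, 2, 2)  # indices (a, b, a', b')
    return np.trace(r, axis1=1, axis2=3) if keep == "A" else np.trace(r, axis1=0, axis2=2)

def terms(rho, basis):
    """Return (delta S(A), delta S(B|A)) of eq. (86) for Alice's projectors built from `basis`."""
    s_a, s_b = entropy(partial_trace(rho, "A")), entropy(partial_trace(rho, "B"))
    shannon, cond = 0.0, 0.0
    for v in basis:
        proj = np.kron(np.outer(v, v.conj()), np.eye(2))
        block = proj @ rho @ proj
        p = float(np.real(np.trace(block)))
        if p > 1e-12:
            shannon -= p * np.log2(p)  # contributes to S(rho'_A)
            cond += p * entropy(partial_trace(block, "B") / p)
    return s_a - shannon, s_b - cond

def classical_measures(rho, n_theta=25, n_phi=24):
    """Grid estimates of the one-way classical deficit (86) and of C_HV (87)."""
    grid = [terms(rho, qubit_basis(t, f))
            for t in np.linspace(0, np.pi, n_theta)
            for f in np.linspace(0, 2 * np.pi, n_phi, endpoint=False)]
    return max(da + db for da, db in grid), max(db for _, db in grid)

# Bell-diagonal state (hypothetical weights): rho_A is maximally mixed.
s = 1 / np.sqrt(2)
bell = [np.array(v, dtype=complex) * s
        for v in ([1, 0, 0, 1], [1, 0, 0, -1], [0, 1, 1, 0], [0, 1, -1, 0])]
rho_bell = sum(w * np.outer(v, v.conj()) for w, v in zip([0.7, 0.1, 0.1, 0.1], bell))

# Separable state whose local density matrix on Alice's side is not maximally mixed.
k0, k1 = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
plus = (k0 + k1) / np.sqrt(2)
rho_sep = (0.8 * np.outer(np.kron(k0, k0), np.kron(k0, k0).conj())
           + 0.2 * np.outer(np.kron(plus, k1), np.kron(plus, k1).conj()))

d_bell, c_bell = classical_measures(rho_bell)
d_sep, c_sep = classical_measures(rho_sep)
print(d_bell, c_bell, d_sep, c_sep)
```

For the Bell-diagonal state Alice's entropic cost vanishes for every basis, so the two estimates coincide; for the second state the sketch only confirms that the classical deficit does not exceed the Henderson-Vedral value on the grid.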

In the asymptotic limit of many copies, one has equality [20]. Actually, in [20] it was shown that the regularized one-way
classical deficit is equal to another operational measure of classical correlations: the distillable common randomness
introduced in [55]. The latter is in turn equal to the regularized Henderson-Vedral measure. It is interesting that
without regularization $\Delta_{cl}^{\rightarrow}$, although it seems to be an important characteristic of classical
correlations, does not meet a basic requirement for being a measure of classical correlations: it is not monotonic under
local operations [40]. Thus regularization plays here the role of monotonization. An interesting question is what happens
to the two-way classical deficit after regularization.
B. Additivity of one-way quantum and classical deficits for Bell diagonal states

Here we will prove the fact mentioned in Section IX A that the one-way deficits are additive, and that borrowing
pure qubits does not help for Bell diagonal states. First of all, in [?] it was shown that the measure of classical
correlations $C_{HV}$ is additive for Bell diagonal states. Let us recall the argument, as it will be useful for making
the connection with the classical deficit. For a Bell diagonal state $\varrho$, consider the related channel $\Lambda$ (i.e.
the channel such that $(I \otimes \Lambda)(|\phi^+\rangle\langle\phi^+|) = \varrho$). The maximum output Holevo function over
all input ensembles, denoted by $\chi^*(\Lambda)$, is, for general channels, no smaller than $C_{HV}$. They are equal if the
density matrix of the ensemble attaining $\chi^*$ is equal to $\varrho_A$. In the case of Bell diagonal states, we have
$\varrho_A = I/2$, and it turns out that the optimal ensemble for the corresponding channels consists of two orthogonal
states, hence gives rise to the same matrix. King [?] has shown that $\chi^*$ is additive for channels coming from Bell
diagonal states. From this and from the fact that, in general, $\chi^* \ge C_{HV}$, one gets that for Bell diagonal states
$C_{HV}$ for many copies is also equal to $\chi^*$ for many copies of the corresponding channels. This proves that $C_{HV}$
must be additive.

Now, let us make the connection with the classical deficit. As discussed in [?], if $\chi^*$ is attained on an ensemble whose
density matrix is equal to $\varrho_A$, then by looking at the ensemble maximizing $\chi$, one can tell something about the
measurements that attain $C_{HV}$. Namely, when the ensemble is orthogonal, one attains $C_{HV}$ by a measurement in the
eigenbasis of $\varrho_A$. Now, it is obvious from eq. (87) and the discussion thereafter that in the latter case $C_{HV}$ is
actually equal to the classical deficit, as they differ from one another only by the entropy production during Alice's
measurement, which vanishes if it is done in the eigenbasis. Since $C_{HV}$ is additive, for many copies it is again attained
by a measurement in an orthogonal basis that is the eigenbasis of Alice's subsystem. Thus the classical deficit for many
copies is also not less than $C_{HV}$, and by eq. (86) it cannot be greater.

Thus for Bell diagonal states the deficit is equal to $C_{HV}$ and it is additive. Moreover, since the measurement was
a von Neumann one, the deficit is attained without using POVMs. This means that additional pure ancillas do not
help.

So far we have talked about classical deficit. Now, since quantum and classical deficit add up to mutual information 
which is additive, it follows that quantum deficit is additive too. Also, since borrowing local qubits does not increase 
classical deficit, it cannot decrease quantum deficit. 

C. Zero-way measures

Let us now consider measures of classical correlations under no classical communication, $\Delta_{cl}^{\emptyset}$. Again,
this is taken to mean that the parties are not allowed to communicate before making measurements, but can do so afterward
in order to concentrate the classical records. The information deficit under no classical communication,
$\Delta^{\emptyset}$, is given by

$\Delta^{\emptyset} = S(\varrho'_{AB}) - S(\varrho_{AB})$,

where $S(\varrho'_{AB})$ is the von Neumann entropy of the optimal final state $\varrho'_{AB}$ (which is classical-like),
obtained by local complete measurements, without classical communication. We then obtain

$\Delta_{cl}^{\emptyset} = S(\varrho_A) + S(\varrho_B) - S(\varrho'_{AB})$
$= \{S(\varrho_A) - S(\varrho'_A)\} + \{S(\varrho_B) - S(\varrho'_B)\} + \{S(\varrho'_A) + S(\varrho'_B) - S(\varrho'_{AB})\}$
$= \delta S(A) + \delta S(B) + I_M(\varrho')$.

We have three terms here: the last one,

$I_M(\varrho') = S(\varrho'_A) + S(\varrho'_B) - S(\varrho'_{AB})$,

is the classical mutual information of the final state, while the first two, $\delta S(A) = S(\varrho_A) - S(\varrho'_A)$ and
$\delta S(B) = S(\varrho_B) - S(\varrho'_B)$, denote the local entropic costs of the process at the respective sides. We
therefore have a trade-off similar to that in the one-way case. And again, a classical correlation measure was defined [65]
which consists only of the last term of our quantity,

$C = \sup_{P_i} I_M(\varrho')$, (88)

where $\varrho'$ is obtained out of $\varrho$ by local complete measurements. Again the original definition of $C$ involved
POVMs, but as we have suitably increased our Hilbert space from the very beginning, we need not do so.
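As a worked illustration of the three-term decomposition (an example added here, not part of the original derivation), take the singlet state $|\psi^-\rangle$ and let both parties perform complete measurements in the $\sigma_z$-basis. No claim is made that this choice of measurements is the only optimal one.

```latex
\begin{align}
\varrho_{AB} &= |\psi^-\rangle\langle\psi^-|, \qquad S(\varrho_A) = S(\varrho_B) = 1, \\
\varrho'_{AB} &= \tfrac{1}{2}\bigl(|01\rangle\langle 01| + |10\rangle\langle 10|\bigr), \qquad
  S(\varrho'_A) = S(\varrho'_B) = S(\varrho'_{AB}) = 1, \\
\delta S(A) &= \delta S(B) = 0, \qquad I_M(\varrho') = 1 + 1 - 1 = 1, \\
\delta S(A) + \delta S(B) + I_M(\varrho') &= 1 = S(\varrho_A) + S(\varrho_B) - S(\varrho'_{AB}).
\end{align}
```

For this choice the local entropic costs vanish, so the whole value comes from the classical mutual information of the final state and the quantity coincides with the term entering eq. (88).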
XV. COMPLEMENTARITY FEATURES OF INFORMATION IN DISTRIBUTED QUANTUM SYSTEMS

Bohr was the first to recognize a fundamental feature of the quantum formalism: complementarity between
incompatible observables. Complementarity was not explicitly related to entanglement, now regarded as an important
quantum information resource. Namely, Bohr's complementarity concerned mutually exclusive quantum phenomena
associated with a single system and observed under different experimental arrangements.

Let us comment on complementarity in the case of composite systems and Bohr complementarity. Roughly speaking,
the latter says that one cannot access all the properties of a system necessary to describe it by one measurement. The
rule is formulated for single quantum systems and is a consequence of noncommutativity.

On the other hand we know that one can also divide the properties of the system into local and nonlocal ones,
and they are complementary to each other too [16]. For example, one can perform a measurement in the Bell basis or in
the standard product basis. However, one cannot perform those measurements simultaneously. In other words, one cannot
simultaneously access global and local properties of the system (see also [?] in this context).

The latter phenomenon is not merely a consequence of Bohr's complementarity. Indeed, if the only allowable states
of composite systems were the classically correlated states

$\varrho_{AB} = \sum_{ij} p_{ij} |i\rangle\langle i| \otimes |j\rangle\langle j|$, (89)

then maximal information about the total system would be available through measurements on subsystems. Global
measurements would not access any further knowledge about properties of the system. On the other hand, Bohr
complementarity would still hold, in the sense that one cannot access all properties of the system in one measurement.

Thus we see that the local-nonlocal complementarity [16] is a consequence of two distinct phenomena: noncommutativity
and the existence of entanglement (or quantum correlations). So not only is there noncommutativity, but there
is too much of it, so that it also affects relations between local and nonlocal informational contents.

In distributed systems one usually imposes constraints by allowing only operations that can be done by classical
communication and local operations. It turns out that in such a situation an interesting complementarity also arises.
Namely, in [16] we considered two tasks: localizing information (which we have presented in this paper) and
sending quantum information (e.g. teleportation), performed simultaneously. It was shown that for a fixed protocol
$\mathcal{P}$, the rates of those two tasks obey the following relation

$I_l(\mathcal{P}, \rho) + Q(\mathcal{P}, \rho) \le I_l(\rho)$, (90)

where $I_l(\mathcal{P}, \rho)$ is the amount of information localised by the protocol $\mathcal{P}$ and
$Q(\mathcal{P}, \rho)$ is the number of qubits transmitted by the protocol.

For example, for the singlet state, the total informational content is equal to the total correlation content and amounts
to two bits. The right hand side of the inequality is equal to 1. In the light of the above complementarity, we can
interpret this number 2 as follows: 2 is equal not to 1 plus 1 but to 1 or 1. One can either draw one bit of local
information (classical correlations) or teleport one qubit (quantum correlations); however, one cannot access both bits.

One can see that this phenomenon is connected with the above-mentioned Bohr complementarity for distributed systems:
for the task of teleportation, Alice makes a Bell measurement on her part of the singlet and the unknown state
to be sent, while to localize information, she measures only her half of the singlet. Interestingly, as far as those two
exclusive measurements are concerned, the "local versus nonlocal" complementarity occurs within Alice's laboratory,
while it results in complementarity between tasks that refer to local-nonlocal properties of systems belonging to Alice
and Bob.

The above inequality suggests an interesting problem: to find the trade-off curves for the performances of teleportation
and localizing information for a given state. In particular, an interesting question is whether there exist states for
which, if we teleport a number of qubits equal to the distillable entanglement, one not only would not localize any
information, but would need to spend some additional pure states (see [17] in this context).

XVI. DISCUSSION AND OPEN QUESTIONS

In conclusion, we have developed the quantum information processing paradigm which involves local information
as a natural resource in the context of the class of CLOCC operations. We have presented a proof that the central quantity
of the paradigm, the quantum information deficit, is bounded from above by the relative entropy distance from the set
of pseudo-classically correlated states. We showed how the paradigm allows one to define the thermodynamical cost of
erasure of entanglement: the entropy production necessary to make a state separable by CLOCC operations. We proved
that the cost is no smaller than the relative entropy of entanglement. Since the cost is no greater than the deficit, we
have obtained that the deficit is no smaller than relative entropy of entanglement. This in turn implies that every 
entangled state exhibits informational nonlocality. 

We have also found that the paradigm offers a new method of analysis of correlations of multipartite states. The
most nonlocal state from this point of view (we call it informationally nonlocal) would be the one for which one has to
produce the largest entropy while converting it into classical states. It turned out that according to such a criterion,
the Aharonov state is much more nonlocal than the GHZ one. The nonlocality that can be probed by our methods is one
that is not caught by Bell's inequalities, since we have found that separable states can also exhibit nonzero deficit.
Rather, it has much in common with nonlocality without entanglement, which was found for ensembles of states [10].
Thus our nonlocality is not identical with entanglement. As a matter of fact, it is a wider notion.

The information deficit has then some peculiar properties. Since it is not an entanglement measure, it can increase 
under local operations. It is not unreasonable: Local operations may destroy a local property, and make it impossible 
to carry out some action by separated parties, while when the parties meet, the action may still be achievable. This 
curious behavior of quantum states may be attributed to the fact that even for separable states, when they are mixtures 
of nonorthogonal states, we cannot ascribe to the subsystems local properties (this may have some connection with 
the Kochen-Specker theorem). 

The paradigm developed in this paper opens many important questions. Here are some of them. 

• Are "noncommuting-choice protocols" better in localizing information? This is the major problem in the
paradigm of localizing information by CLOCC operations.

• Is the quantum deficit equal to the relative entropy distance to pseudo-classically correlated states? This question
would be answered positively if the noncommuting-choice protocols do not help.

• Is the regularized deficit still nonzero for all entangled states? For the regularized deficit we have a lower bound
given by the regularized relative entropy distance. However, we do not know if the latter is nonzero for every entangled state.

• Is the deficit for multiparty pure states equal to the relative entropy of entanglement? For bipartite states it was
proven that the deficit is equal to entanglement. For the multiparty case it is also true for Schmidt decomposable states.
It is an open problem whether it is true in general. The same question can be asked for the regularized deficit. Is
it equal to the regularized $E_r$ for multipartite pure states?

• Is the two-way classical deficit a legitimate measure of classical correlations? The classical deficit definitely is
an important quantity describing some aspects of classical correlations. However, there is a question whether it
can be used to quantify them. To this end, it should not increase under local operations [13]. In the one-way
case, the classical deficit is not monotonic under local operations, as shown in [40]. Yet it turns out that after
regularization the monotonicity is regained [20], because the regularized one-way classical deficit is equal to the one-way
distillable common randomness of [55]. Can the two-way classical deficit also be monotonic after regularization?
This is connected with the next question:

• Is the classical two-way deficit equal to the two-way distillable common randomness [53]?

• Is relative entropy of entanglement the thermodynamical cost of erasure of entanglement? We have shown that 
the cost is bounded from below by relative entropy of entanglement. If there is equality, relative entropy of 
entanglement would acquire operational status: it would be interpreted as thermodynamical cost of erasure of 
entanglement. 

• What is the relation between the deficit and mutual information? We have shown that if the trade-off inequality (75)
for $E_r$ were violated, then the quantum deficit would be more than the classical deficit for some states.
We have also touched on this question by analysis of the zero-way deficit versus mutual information. Preliminary
results suggest that there is a very interesting phenomenon while going from quantum to classical states via
local measurements: for some states, before measurement there are large correlations quantified by mutual
information, while after measurement the remaining amount of information is equal almost exclusively to the
initial local information. This means that for some states, even an optimal measurement may destroy most of the
information contained in correlations. The question can be recast in the following way: how small can the
classical deficit be relative to the mutual information?

In [?] the measure of classical correlations (88), closely related to the zero-way deficit, was compared with mutual
information. The authors showed that when this measure is smaller than $\epsilon$ then the mutual information is smaller
than $\epsilon\,\mathrm{poly}(d)$, where $d$ is the dimension of the Hilbert space. They were however unable to improve the
factor to be of order $\log d$. This means that most probably there is room for a dramatic divergence between the two
measures of correlations. Since the deficit can only be smaller than the measure (88), the effect can be even
stronger. All that suggests that there may be a large gap between the classical and the quantum.

• A fundamental open problem, or rather program, is to analyze the complementarity between drawing local information
and distilling singlets, initiated in [16]. In the latter paper, the two tasks, drawing local information
and teleporting qubits, were treated as complementary ones. One obtains trade-offs if one wants to perform
those tasks simultaneously. An open question is whether distilling singlets can lead to a negative amount of local
information gained, i.e. whether in the process of distillation we have to use up local pure qubits rather than gain
them. Moreover, one can define the following quantity: the maximal amount of pure qubits one can draw by
CLOCC from a given state [19]. Note that here we do not speak about local qubits. Thus, for example, a singlet is
already pure and needs no action. Due to reversibility in entanglement transformations for pure bipartite states
[35], the question in fact reduces to the problem of drawing singlets and local pure qubits simultaneously.

• An interesting question arises in the context of [68]. There the authors probe correlations by applying random
local unitaries to transform the state to product or separable form, using the smallest number of unitaries.
This method allows one to define not only quantum correlations but also total correlations in terms of entropy
production while reaching some set of states. It differs from our approach in that the authors do not use
classical communication in an essential way (it cannot help). Therefore a natural application of their method
is to probe total correlations. This allows them to give a fresh, operational meaning to the quantum mutual
information: it is the entropy production needed to bring a quantum state into product form. Our method
could be applied in a similar way: one tries to bring a state into product form using CLOCC but without the
classical communication (i.e. CLO). Then one finds that the entropy production (i.e. the deficit to product states
$\Delta_{prod}$) is equal to $I(\rho_{AB})$. This can be seen simply from the fact that the optimal protocol is for one
party to locally compress her state and then to dephase in the eigenbasis of the compressed state. She then dephases in a
basis complementary to the eigenbasis. The latter measurement completely destroys all correlations between A
and B. Since the initial entropy was $S(\rho_{AB})$ and the final entropy is $S(A) + S(B)$, the deficit, and hence the
entropy production, is $I(\rho_{AB})$. Just as the relative entropy distance to some set of states (pseudo-classically
correlated, and separable states) played a crucial role in the case of $\Delta$ and $\Delta_{sep}$, here the relative
entropy distance to product states plays the crucial role, and is equal to the quantum mutual information.

It is rather amusing that this gives the same answer as the method used in [68], since in our case Alice
performs her measurement without any knowledge of the density matrix of Bob, while in [68] she must use this
information. Furthermore, the number of unitaries which would be needed to perform the dephasing in our case
is S(A) 2 , far greater than the optimal number found in [68]. Understanding in greater detail why these two
methods give the same answer might be an interesting avenue of further research. It is also interesting to compare
how one divides the total correlations into quantum and classical ones. For example, in the case of the singlet,
the authors of [68] interpret the two bits of mutual information as requiring one bit of noise to destroy the entanglement,
and one bit of noise to destroy the secret correlations. In [16] we interpreted the two bits in terms of one
use of a quantum channel, or one bit of local information.

In the case of destroying correlations due to entanglement, our method uses classical communication in an
essential way; therefore, on the surface, it appears to naturally encode the notion of entanglement, whose definition
relies on the class of LOCC. For pure states the authors of [68] also obtain entanglement, as in this case
communication is not needed to reach the set of separable states. It is interesting then to compare what
both approaches would produce as far as the entropic cost of erasing entanglement is concerned. One could expect
that our method will show less cost in the case of erasing entanglement.
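The statement in the list above that the relative entropy distance to product states equals the quantum mutual information can be checked numerically. The following is a minimal sketch, assuming NumPy; it uses the standard fact (assumed here, not derived) that the closest product state in relative entropy is the product of the marginals, and the helper names and the random test state are hypothetical.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def log2_psd(rho):
    """Base-2 matrix logarithm of a strictly positive Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log2(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """S(rho||sigma) = tr rho (log rho - log sigma) in bits; both arguments assumed full rank."""
    return float(np.real(np.trace(rho @ (log2_psd(rho) - log2_psd(sigma)))))

def random_state(dim, rng):
    """Random full-rank density matrix built from a complex Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
rho = random_state(4, rng)

# Marginals of the two-qubit state; reshaped indices are (a, b, a', b').
r = rho.reshape(2, 2, 2, 2)
rho_a = np.trace(r, axis1=1, axis2=3)
rho_b = np.trace(r, axis1=0, axis2=2)

mutual_info = entropy(rho_a) + entropy(rho_b) - entropy(rho)
distance = relative_entropy(rho, np.kron(rho_a, rho_b))
print(mutual_info, distance)
```

The two printed numbers agree up to rounding, which is the identity $S(\rho_{AB} \| \rho_A \otimes \rho_B) = I(\rho_{AB})$ behind the remark on product states.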

Finally, we strongly believe that the novel paradigm analyzed and developed here will be helpful as a new
rigorous tool in searching for a border, or rather a way of coexistence, between quantumness and classicality in physical
states. It may also enrich our understanding of quantum information processing and its relation to other branches of
physics like thermodynamics and statistics.